



PTO/SB/17p (11-04)

Approved for use through 07/31/2007. OMB 0651-0031

U.S. Patent and Trademark Office; U.S. DEPARTMENT OF COMMERCE

Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it displays a valid OMB control number.

DAC  
JRW

**PETITION FEE  
Under 37 CFR 1.17(f), (g) & (h)  
TRANSMITTAL**  
(Fees are subject to annual revision)

Send completed form to: Commissioner for Patents  
P.O. Box 1450, Alexandria, VA 22313-1450

|                        |                          |
|------------------------|--------------------------|
| Application Number     | 09/223,046               |
| Filing Date            | December 30, 1998        |
| First Named Inventor   | Timothy J. Van Hook      |
| Art Unit               | 2154                     |
| Examiner Name          | Donaghue, Larry D.       |
| Attorney Docket Number | 0056.10US (1778.0110001) |

Enclosed is a petition filed under 37 CFR 1.47(a) that requires a processing fee (37 CFR 1.17(f), (g), or (h)). Payment of \$200.00 is enclosed.

This form should be included with the above-mentioned petition and faxed or mailed to the Office using the appropriate Mail Stop (e.g., Mail Stop Petition), if applicable. For transmittal of processing fees under 37 CFR 1.17(l), see form PTO/SB/17l.

**Payment of Fees** (small entity amounts are NOT available for the petition fees)

- The Commissioner is hereby authorized to charge the following fees to Deposit Account No. 19-0036:
  - petition fee under 37 CFR 1.17(f), (g) or (h)
  - any deficiency of fees and credit of any overpayments

  Enclose a duplicative copy of this form for fee processing.
- Check in the amount of \$ \_\_\_\_\_ is enclosed.
- Payment by credit card (Form PTO-2038 or equivalent enclosed). Do not provide credit card information on this form.

**Petition Fees under 37 CFR 1.17(f): Fee \$400 Fee Code 1462**

For petitions filed under:

- § 1.53(e) - to accord a filing date.
- § 1.57(a) - to accord a filing date.
- § 1.182 - for decision on a question not specifically provided for.
- § 1.183 - to suspend the rules.
- § 1.378(e) - for reconsideration of decision on petition refusing to accept delayed payment of maintenance fee in an expired patent.
- § 1.741(b) - to accord a filing date to an application under § 1.740 for extension of a patent term.

**Petition Fees under 37 CFR 1.17(g): Fee \$200 Fee Code 1463**

For petitions filed under:

- § 1.12 - for access to an assignment record.
- § 1.14 - for access to an application.
- § 1.47 - for filing by other than all the inventors or a person not the inventor.
- § 1.59 - for expungement of information.
- § 1.103(a) - to suspend action in an application.
- § 1.136(b) - for review of a request for extension of time when the provisions of section 1.136(a) are not available.
- § 1.295 - for review of refusal to publish a statutory invention registration.
- § 1.296 - to withdraw a request for publication of a statutory invention registration filed on or after the date the notice of intent to publish issued.
- § 1.377 - for review of decision refusing to accept and record payment of a maintenance fee filed prior to expiration of a patent.
- § 1.550(c) - for patent owner requests for extension of time in ex parte reexamination proceedings.
- § 1.956 - for patent owner requests for extension of time in inter partes reexamination proceedings.
- § 5.12 - for expedited handling of a foreign filing license.
- § 5.15 - for changing the scope of a license.
- § 5.25 - for retroactive license.

**Petition Fees under 37 CFR 1.17(h): Fee \$130 Fee Code 1464**

For petitions filed under:

- § 1.19(g) - to request documents in a form other than that provided in this part.
- § 1.84 - for accepting color drawings or photographs.
- § 1.91 - for entry of a model or exhibit.
- § 1.102(d) - to make an application special.
- § 1.138(c) - to expressly abandon an application to avoid publication.
- § 1.313 - to withdraw an application from issue.
- § 1.314 - to defer issuance of a patent.

  
Signature

Donald J. Featherstone

Typed or printed name

5/18/05  
Date

33,876

Registration No., if applicable

This collection of information is required by 37 CFR 1.17. The information is required to obtain or retain a benefit by the public which is to file (and by the USPTO to process) an application. Confidentiality is governed by 35 U.S.C. 122 and 37 CFR 1.11 and 1.14. This collection is estimated to take 5 minutes to complete, including gathering, preparing, and submitting the completed application form to the USPTO. Time will vary depending upon the individual case. Any comments on the amount of time you require to complete this form and/or suggestions for reducing this burden, should be sent to the Chief Information Officer, U.S. Patent and Trademark Office, U.S. Department of Commerce, P.O. Box 1450, Alexandria, VA 22313-1450. DO NOT SEND FEES OR COMPLETED FORMS TO THIS ADDRESS. SEND TO: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450.

If you need assistance in completing the form, call 1-800-PTO-9199 and select option 2.



Robert Greene Sterne  
Edward J. Kessler  
Jorge A. Goldstein  
David K.S. Cornwell  
Robert W. Esmond  
Tracy-Gene G. Durkin  
Michele A. Cimbala  
Michael B. Ray  
Robert E. Sokohl  
Eric K. Steffe  
Michael Q. Lee  
Steven R. Ludwig  
John M. Covert  
Linda E. Alcorn  
Robert C. Millonig  
Donald J. Featherstone  
Timothy J. Shea, Jr.  
Lawrence B. Bugaisky  
Michael V. Messinger  
Judith U. Kim

Patrick E. Garrett  
Jeffrey T. Helvey  
Heidi L. Kraus  
Eldora L. Ellison  
Thomas C. Fiala  
Albert L. Ferro\*  
Donald R. Banowitz  
Peter A. Jackman  
Teresa A. Medler  
Jeffrey S. Weaver  
Kendrick P. Patterson  
Vincent L. Capuano  
Brian J. Del Buono  
Virgil Lee Beaston  
Theodore A. Wood  
Elizabeth J. Haanes  
Joseph S. Ostroff  
Frank R. Cottingham  
Christine M. Lhulier

Rae Lynn P. Guest  
George S. Bardmesser  
Daniel A. Klein  
Jason D. Eisenberg  
Michael D. Specht  
Andrea J. Kamage  
Tracy L. Muller  
Jon E. Wright  
LuAnne M. DeSantis  
Ann E. Summerfield  
Aric W. Ledford  
Helene C. Carlson  
Cynthia M. Bouchez  
Timothy A. Doyle\*  
Gaby L. Longsworth  
Lori A. Gordon  
Nicole D. Dretar  
Ted J. Ebersole  
Jyoti C. Iyer

Laura A. Vogel  
Michael J. Mancuso  
Bryan S. Wade  
Aaron L. Schwartz  
Michael G. Penn\*  
Matthew E. Kelley\*  
Shannon A. Carroll\*  
Nicole R. Kramer\*

**Registered Patent Agents:**

Karen R. Markowicz  
Nancy J. Leith  
Matthew J. Dowd  
Katrina Yujian Pei Quach  
Bryan L. Skelton  
Robert A. Schwartzman  
Teresa A. Colella  
Jeffrey S. Lundgren  
Victoria S. Rutherford

Michelle K. Holoubek  
Simon J. Elliott  
Julie A. Heider  
Mita Mukherjee  
Scott M. Woodhouse  
Christopher J. Walsh  
Liliana Di Nola-Baron  
Peter A. Socarras  
**Of Counsel**  
Kenneth C. Bass III  
Evan R. Smith  
Marvin C. Guthrie  
\*Admitted only in Maryland  
+Admitted only in Virginia  
•Practice Limited to Federal Agencies

May 18, 2005

**WRITER'S DIRECT NUMBER:**  
(202) 772-8629

**INTERNET ADDRESS:**  
DONF@SKGF.COM

Commissioner for Patents  
P.O. Box 1450  
Alexandria, VA 22313-1450

**Mail Stop Petition**  
**Art Unit 2154**

Re: U.S. Utility Patent Application  
Application No. 09/223,046; Filed: December 30, 1998  
For: **Method for Providing Extended Precision in SIMD Vector Arithmetic Operations**  
Inventors: Van Hook *et al.*  
Our Ref: 0056.10US (1778.0110001)

Sir:

In response to the Decision Refusing Status Under 37 C.F.R. 1.47(a) mailed March 18, 2005, the following documents are transmitted herewith for appropriate action by the U.S. Patent and Trademark Office:

1. Petition Fee Transmittal (PTO/SB/17p);
2. Request for Reconsideration of Petition Under 37 C.F.R. § 1.47(a);
3. Statement of Facts in Support of Filing On Behalf of Non-Signing Inventor Under 37 C.F.R. § 1.47(a), including the following:
  - a. Exhibit A, a copy of a letter sent to Timothy J. Van Hook on April 25, 2005;
  - b. Exhibit B, a copy of the FedEx Shipping Label for the package sent to Timothy J. Van Hook on April 25, 2005;
  - c. Exhibit C, a copy of a self-addressed stamped envelope sent to Timothy J. Van Hook on April 25, 2005;
  - d. Exhibit D, a copy of the original patent application specification as filed in the present application on December 30, 1998, and in parent U.S. Patent Application No. 08/947,648, on October 9, 1997;

  - e. Exhibit E, a copy of an Amendment and Reply Under 37 C.F.R. § 1.111 as filed on August 11, 2000, and sent to Timothy J. Van Hook on April 25, 2005;
  - f. Exhibit F, a copy of an Amendment Under 37 C.F.R. § 1.312 and accompanying Request to Approve Proposed Drawing Corrections as filed on December 5, 2000, and sent to Timothy J. Van Hook on April 25, 2005;
  - g. Exhibit G, a copy of an Amendment and Reply Under 37 C.F.R. § 1.111 as filed on June 14, 2002, and sent to Timothy J. Van Hook on April 25, 2005;
  - h. Exhibit H, a copy of an Amendment and Reply Under 37 C.F.R. § 1.114 as filed on October 13, 2004, and sent to Timothy J. Van Hook on April 25, 2005;
  - i. Exhibit I, a copy of the currently pending claims for the present application as sent to Timothy J. Van Hook on April 25, 2005;
  - j. Exhibit J, a Supplemental Declaration for U.S. Patent Application No. 09/223,046 as sent to Timothy J. Van Hook on April 25, 2005;
  - k. Exhibit K, a copy of an email from FedEx dated April 28, 2005, confirming delivery of the FedEx Shipment to Timothy J. Van Hook on April 28, 2005;
4. Copy of the original executed Declaration (*in four parts*) as filed in both the present application and parent U.S. Patent Application No. 08/947,648;
5. Credit Card Payment Form (PTO-2038) for \$200.00 to cover the petition fee set forth in 37 C.F.R. § 1.17(g); and
6. One (1) return postcard.

It is respectfully requested that the attached postcard be stamped with the date of filing of these documents, and that it be returned to our courier. In the event that an extension of time is necessary to prevent abandonment of this patent application, then such extension of time is hereby petitioned.


The U.S. Patent and Trademark Office is hereby authorized to charge any fee deficiency, or credit any overpayment, to our Deposit Account No. 19-0036.

Respectfully submitted,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.



Donald J. Featherstone  
Attorney for Applicants  
Registration No. 33,876

DJF/LMY:krd  
Enclosures

399215\_1.DOC



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

|                                                                                      |                                        |
|--------------------------------------------------------------------------------------|----------------------------------------|
| In re application of:                                                                | Confirmation No.: 2296                 |
| Van Hook *et al.*                                                                    | Art Unit: 2154                         |
| Appl. No.: 09/223,046                                                                | Examiner: Donaghue, Larry D.           |
| Filed: December 30, 1998                                                             | Atty. Docket: 0056.10US (1778.0110001) |
| For: **Method for Providing Extended Precision in SIMD Vector Arithmetic Operations** |                                        |

**Request for Reconsideration of Petition Under 37 C.F.R. § 1.47(a)**

Commissioner for Patents  
P.O. Box 1450  
Alexandria, VA 22313-1450

*Mail Stop Petition*

Sir:

In response to the Decision Refusing Status Under 37 CFR 1.47(a) mailed March 18, 2005, Applicants hereby respectfully request that the petition under 37 C.F.R. § 1.47(a) filed in the above-captioned patent application be reconsidered along with the documents submitted herewith and that status under 37 C.F.R. § 1.47(a) be accorded to the present application. Accordingly, in satisfaction of the requirements set forth in 37 C.F.R. § 1.47(a) and M.P.E.P. §§ 409.03(a), (d), and (e), the following documents are filed herewith:

1. A copy of the original Declaration for Patent Application executed by inventors Henry P. Moreton, Peter Hsu, William A. Huffman, and Earl A. Killian as filed in the above-captioned patent application, in accordance with 37 C.F.R. § 1.47(a);
  
2. A Statement of Facts in Support of Filing On Behalf of Non-Signing Inventor Under 37 C.F.R. § 1.47(a) from LuAnne M. DeSantis, Esq., along with referenced Exhibits A-K; and

3. A Credit Card Payment Form (PTO-2038) for \$200.00 in payment of the petition fee set forth in 37 C.F.R. § 1.17(g) in accordance with 37 C.F.R. § 1.47.

The Decision Refusing Status Under 37 CFR 1.47(a) mailed March 18, 2005, states:

A grantable petition under 37 C.F.R. § 1.47(a) requires: (1) proof that the non-signing inventor cannot be reached or refuses to sign the oath or declaration after having been presented with the application papers (specification, claims and drawings); (2) an acceptable oath or declaration in compliance with 35 U.S.C. §§ 115 and 116; (3) the petition fee; and (4) a statement of the last known address of the non-signing inventor....

Petitioner respectfully submits that the documents filed herewith satisfy all the requirements listed above and set forth in the Decision Refusing Status Under 37 CFR 1.47(a).

***(1) Proof that the non-signing inventor cannot be reached or refuses to sign the oath or declaration after having been presented with the application papers***

A Statement of Facts in Support of Filing on Behalf of Non-Signing Inventor Under 37 C.F.R. § 1.47(a), as required by M.P.E.P. §§ 409.03(a)(B) and (d), from LuAnne M. DeSantis, Esq., is submitted herewith along with Exhibits A-K. These documents provide proof of the pertinent facts that the non-signing inventor refuses to sign the Declaration after having been presented with the application papers.

***(2) An acceptable oath or declaration in compliance with 35 U.S.C. §§ 115 and 116***

The Decision Refusing Status Under 37 CFR 1.47(a) mailed March 18, 2005, states that "a new declaration is not required." However, for the convenience of the Petitions Attorney, a copy of the original Declaration for Patent Application executed by inventors Henry P. Moreton, Peter Hsu, William A. Huffman, and Earl A. Killian, with the signature block of the non-signing inventor Timothy J. Van Hook left blank, is provided herewith as filed in the above-captioned patent application in accordance with 37 C.F.R. § 1.47(a). Petitioner respectfully submits that the Declaration for Patent Application provided herewith should be considered as having been signed by all of the available joint inventors on behalf of the non-signing inventor in accordance with M.P.E.P. § 409.03(a)(A).

***(3) The petition fee***

A Credit Card Payment Form (PTO-2038) for \$200.00 in payment of the petition fee set forth in 37 C.F.R. § 1.17(g) is submitted herewith in accordance with 37 C.F.R. § 1.47.

***(4) A statement of the last known address of the non-signing inventor***

Petitioner hereby states that the last known address of the non-signing inventor Timothy J. Van Hook as required by M.P.E.P. §§ 409.03(a)(C) and (e) appears below.

224 Oakgrove Avenue  
Atherton, CA 94027

***Summary***

Petitioner respectfully submits that the documents filed herewith satisfy all the requirements in the Decision Refusing Status Under 37 CFR 1.47(a) and the requirements set forth in 37 C.F.R. § 1.47(a) and M.P.E.P. §§ 409.03(a), (d) and (e). Therefore, Petitioner requests that the present application be accorded status under 37 C.F.R. § 1.47(a).

It is believed that no extension of time is necessary. However, if an extension of time is required to prevent abandonment of the present application, then such extension is hereby petitioned and any fee required therefor is hereby authorized to be charged to our Deposit Account No. 19-0036.

As noted above, a Credit Card Payment Form (PTO-2038) for \$200.00 in payment of the fee for filing a petition under 37 C.F.R. § 1.47 as set forth in 37 C.F.R. § 1.17(g) is submitted herewith. The U.S. Patent and Trademark Office is hereby authorized to charge any fee deficiency, or credit any overpayment, to our Deposit Account No. 19-0036.

Respectfully submitted,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.



Donald J. Featherstone  
Attorney for Applicants  
Registration No. 33,876

Date: 5/18/05

1100 New York Avenue, N.W.  
Washington, D.C. 20005-3934  
(202) 371-2600

399285\_1.DOC



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re application of:

Van Hook *et al.*

Appl. No.: 09/223,046

Filed: December 30, 1998

For: **Method for Providing Extended Precision in SIMD Vector Arithmetic Operations**

Confirmation No.: 2296

Art Unit: 2154

Examiner: Donaghue, Larry D.

Atty. Docket: 0056.10US (1778.0110001)

**Statement Of Facts In Support of Filing On Behalf Of Non-Signing Inventor Under 37 C.F.R. § 1.47(a)**

Commissioner for Patents  
P.O. Box 1450  
Alexandria, VA 22313-1450

*Mail Stop Petition*

Sir:

I, LuAnne M. DeSantis, Esq., hereby declare:

1. I am making this statement of facts in support of filing on behalf of a non-signing inventor under 37 C.F.R. § 1.47(a) with regard to U.S. Non-Provisional Patent Application No. 09/223,046, filed December 30, 1998 ("the '046 patent application").
2. I am employed at the law firm of Sterne, Kessler, Goldstein & Fox P.L.L.C. ("SKGF"), 1100 New York Avenue, N.W., Washington, D.C. 20005-3934.
3. Mr. Timothy J. Van Hook ("Mr. Van Hook") is an inventor named in the '046 patent application. His last known address as of April 28, 2005, is as follows:

224 Oakgrove Avenue  
Atherton, CA 94027

4. The invention disclosed and/or claimed in the above-identified patent application was made while Mr. Van Hook was employed by Silicon Graphics, Inc. ("SGI"), 2011 N. Shoreline Boulevard, Mountain View, California, 94043-1389. The '046 patent application is now assigned to MIPS Technologies, Inc. ("MIPS"). The non-signing inventor is not currently employed at either SGI or MIPS.

5. The '046 patent application is a continuation of U.S. Non-Provisional Patent Application No. 08/947,648 ("the '648 patent application"), filed October 9, 1997 (now U.S. Patent No. 5,864,703, issued January 26, 1999).
6. According to SKGF records, a Declaration was signed for the '648 patent application by all inventors except Mr. Van Hook. The Declaration was filed in both the present '046 patent application and the '648 parent application. A petition for status under rule 1.47 was filed in the '648 parent application. Although the face of the patent that issued from the '648 patent application (U.S. Patent No. 5,864,703, issued January 26, 1999) indicates that the application was accorded rule 1.47 status, a copy of a Decision granting status under rule 1.47 could not be located.
7. This Declaration issue was discovered when reviewing the '046 patent application file in preparation for filing a continuation application.
8. On the evening of June 2, 2004, I spoke with Mr. Van Hook via telephone regarding a similar issue in another commonly owned pending application in which Mr. Van Hook is named as an inventor. During this conversation, I asked if I could verify his mailing address. Mr. Van Hook verified that his mailing address was as indicated above in paragraph number 3. In addition, Mr. Van Hook stated that due to the history between his former employer and himself, he may or may not open any package sent to him on behalf of our client, and that he may or may not sign or return anything. He explained, in some detail, that a lawsuit related to intellectual property and/or trade secrets had been brought against him in the past by his former employer. The lawsuit has since been dropped. However, Mr. Van Hook stated that he would not sign anything without a release from his former employer stating that suit will never be brought against him again.
9. On April 25, 2005, SKGF sent a package via Federal Express to Mr. Van Hook that included a letter signed by Mr. Donald J. Featherstone, Esq. of SKGF, a copy of the '046 patent application, copies of subsequent amendments made to the '046 patent application, a list of the currently pending claims in the '046 patent application, a Supplemental Declaration, and a stamped self-addressed return envelope (see EXHIBITS A-J).
10. On April 28, 2005, SKGF received email confirmation from Federal Express that the package sent on April 25, 2005, to Mr. Van Hook was received and signed for by "A.JALPA" (see EXHIBIT K).
11. On May 11, 2005, I telephoned Mr. Van Hook to ask if he had any questions regarding the package sent on April 25, 2005. He confirmed that he had received the package, but again stated that he refuses to sign or return anything due to the conflict with his former employer. I have not received any further communications from Mr. Van Hook.

I declare that all statements made herein of my own knowledge are true and that all statements made on information from review of the file history of the patent application are believed to be true, and further that these statements were made with the knowledge that willful false statements or the like so made are punishable by fine or imprisonment or both under Section 1001 of Title 18 of the United States Code, and that such willful false statements may jeopardize the validity of the patent application or any patent issued thereon.

Respectfully submitted,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.



LuAnne M. DeSantis

Date: 5/18/05

1100 New York Avenue, N.W.  
Washington, D.C. 20005-3934  
(202) 371-2600

399235\_1.DOC

## **EXHIBIT A**



ATTORNEYS AT LAW




April 25, 2005

**WRITER'S DIRECT NUMBER:**  
(202) 772-8629  
**INTERNET ADDRESS:**  
DONF@SKGF.COM

Timothy J. Van Hook  
224 Oakgrove Avenue  
Atherton, CA 94027

*Via Federal Express*

Re: U.S. Patent No. 5,864,703; Issued: January 26, 1999  
U.S. Patent Application No. 09/223,046; Filed December 30, 1998  
For: **Method for Providing Extended Precision in SIMD Vector Arithmetic Operations**  
Inventors: Van Hook *et al.*  
Our Refs: 1778.0110000 and 1778.0110001

Dear Mr. Van Hook:

Our law firm is handling a family of patents on which you were named as an inventor. This patent family specifically includes the following:

- U.S. Patent No. 5,864,703, issued January 26, 1999; and
- U.S. Patent Application No. 09/223,046, filed December 30, 1998.

These documents are entitled "Method for Providing Extended Precision in SIMD Vector Arithmetic Operations" and both are based on the same patent specification. The initially filed patent application was owned by Silicon Graphics, Inc. However, the original patent and the pending application are now owned by MIPS Technologies, Inc., our client.

When the initial patent application was filed, the law firm handling the filing requested status under 37 C.F.R. § 1.47. However, the request did not adequately meet the requirements set forth in Rule 1.47, which allows an application to be patented even though one or more of the inventors refuses to sign the Declaration or cannot be reached. Because the inadequate request for Rule 1.47 Status in the initial application was not discovered until recently, it has been carried through to the subsequently filed application in this patent family.


We recently brought this error to the attention of the U.S. Patent and Trademark Office (USPTO). In order to correct it, the USPTO has directed us to contact you and request that you execute two new Declarations, one for each of the patent documents listed above.

Accordingly, enclosed you will find the following documents:

1. The original patent application specification, as filed on October 9, 1997 (U.S. Patent Application No. 08/947,648), and December 30, 1998 (U.S. Patent Application No. 09/223,046), on which both patent documents are based;
2. U.S. Patent No. 5,864,703, issued January 26, 1999 (from U.S. Patent Application No. 08/947,648, filed October 9, 1997);
3. Copy of an Amendment and Reply Under 37 C.F.R. § 1.111 as filed in the U.S. Patent and Trademark Office on August 11, 2000, for U.S. Patent Application No. 09/223,046;
4. Copy of an Amendment Under 37 C.F.R. § 1.312 and accompanying Request to Approve Proposed Drawing Corrections as filed in the U.S. Patent and Trademark Office on December 5, 2000, for U.S. Patent Application No. 09/223,046;
5. Copy of an Amendment and Reply Under 37 C.F.R. § 1.111 as filed in the U.S. Patent and Trademark Office on June 14, 2002, for U.S. Patent Application No. 09/223,046;
6. Copy of an Amendment and Reply Under 37 C.F.R. § 1.114 as filed in the U.S. Patent and Trademark Office on October 13, 2004, for U.S. Patent Application No. 09/223,046;
7. Copy of the currently pending claims for U.S. Patent Application No. 09/223,046;
8. A Supplemental Declaration for U.S. Patent No. 5,864,703; and
9. A Supplemental Declaration for U.S. Patent Application No. 09/223,046.

We ask that you please review these documents, with particular attention directed toward the claims of each patent document.

A Declaration for a patent application is a document that: 1) confirms each inventor's residence, mailing address, and citizenship; 2) certifies that each inventor contributed to at least one claim of the presented subject matter; 3) certifies that the specification and claims have been reviewed and are understood; and 4) certifies that each inventor acknowledges the duty to disclose information that is material to patentability.

Please carefully review the enclosed Supplemental Declarations and any information appearing thereon. Your "residence" address should be your city and state of residence, or, if you reside outside the United States, the city and country of residence. Your "mailing" address should be the (full) address at which you customarily receive mail. Either your home or business address is an acceptable mailing address. Please make any corrections, if necessary, **in blue ink**, and then **initial and date in the margin**. Once the information on the Declarations is complete and correct, and after your review of the additional documents listed on the previous page, please **sign and date** each Declaration **in blue ink** where indicated. We ask that you attend to this matter as soon as possible.

For your convenience, we have provided a self-addressed, stamped return envelope for returning the signed Supplemental Declarations to us. Or, if you prefer not to sign the enclosed Declarations, feel free to use the enclosed envelope to return them to us unsigned.

Please note that every person who signs a document submitted to the USPTO makes a certification under 37 C.F.R. § 10.18(b) and (c). Therefore, a copy of this rule is also enclosed for your review.

Because U.S. Patent Application No. 09/223,046 has not yet issued as a patent, it is our obligation to remind you that a duty of disclosure continues throughout the entire patent application process and ends only with the actual issuance of a patent. Therefore, if you have or become aware of any information that might be considered material to patentability, please forward it to us immediately.

We, along with our client, greatly appreciate your assistance with this matter, and look forward to your return of the executed Declarations. In the meantime, if you have any comments or questions regarding this matter, please do not hesitate to contact us.

Very truly yours,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.



Donald J. Featherstone

DJF/LMY:krd  
Enclosures

388928\_1.DOC

## **EXHIBIT B**

From: Origin ID: (202)371-2600  
 Donald J. Featherstone  
 Sterne Kessler Goldstein & Fox  
 1100 New York Avenue, NW

Washington, DC 20005



Ship Date: 25APR05  
 Actual Wgt: 1 LB  
 System#: 1162221/INET2000  
 Account#: S \*\*\*\*\*

REF: 1778.0110001



Delivery Address Bar Code

SHIP TO: (650)325-7605 BILL SENDER

Timothy J. Van Hook

224 Oakgrove Avenue

Atherton, CA 94027



### STANDARD OVERNIGHT

TUE

Deliver By:  
26APR05

TRK# 7929 0538 7783 FORM 0201

SFO A2

94027 -CA-US

XH HGTA




## **EXHIBIT C**

**Sterne, Kessler, Goldstein & Fox P.L.L.C.**

Attorneys At Law  
1100 New York Avenue, N.W.  
Washington, DC 20005-3934

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.  
Attn: LuAnne DeSantis  
1100 New York Avenue, N.W.  
Washington, D.C. 20005



## **EXHIBIT D**



UNITED STATES PATENT APPLICATION  
FOR

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR  
ARITHMETIC OPERATIONS

Inventors:

Timothy van Hook

Peter Hsu

William Huffman

Henry Moreton

Earl Killian

Prepared by:

WAGNER, MURABITO & HAO

Two North Market Street

Third Floor

San Jose, California 95113



## A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS

### FIELD OF THE INVENTION

The present claimed invention relates to the field of single instruction multiple data (SIMD) vector processing. More particularly, the present claimed invention relates to extended precision in SIMD vector arithmetic operations.

### BACKGROUND ART

Today, most processors in computer systems provide a 64-bit datapath architecture. The 64-bit datapath allows operations such as read, write, add, subtract, and multiply on the entire 64 bits of data at a time. This added bandwidth has significantly improved performance of the processors.

However, the data types of many real world applications do not utilize the full 64 bits in data processing. For example, in digital signal processing (DSP) applications involving audio, video, and graphics data processing, the light and sound values are usually represented by data types of 8, 12, 16, or 24 bit numbers. This is because people typically are not able to distinguish the levels of light and sound beyond the levels represented by these numbers of bits. Hence, DSP applications typically require data types far less than the full 64 bits provided in the datapath in most computer systems.

In initial applications, the entire datapath was used to compute an image or sound values. For example, an 8 or 16 bit number representing a pixel or sound value was loaded into a 64-bit number. Then, an arithmetic operation, such as an add or multiply, was performed on the entire 64-bit number. This method proved inefficient however, as it was soon realized that not all the data bits were being utilized in the process since digital representation of a sound or pixel requires far fewer bits. Thus, in order to utilize the entire datapath, a multitude of smaller numbers were packed into the 64 bit doubleword.

Furthermore, much of the data processing in DSP applications involves repetitive and parallel processing of small integer data types using loops. To take advantage of this repetitive and parallel data processing, a number of today's processors implement single instruction multiple data (SIMD) in the instruction architecture. For instance, the Intel Pentium MMX<sup>TM</sup> chips incorporate a set of SIMD instructions to boost multimedia performance.

Prior Art Figure 1 illustrates an exemplary single instruction multiple data instruction process. Exemplary registers, vs and vt, in a processor are of 64-bit width. Each register is packed with four 16-bit data elements fetched from memory: register vs contains vs[0], vs[1], vs[2], and vs[3] and register vt contains vt[0], vt[1], vt[2], and vt[3]. The registers in essence contain a vector of N elements. To add elements of matching index, an add instruction adds, independently, each of the element pairs of matching index from vs and vt. A third register, vd, of 64-bit width may be used to store the result. For example, vs[0] is added to vt[0] and its result is stored into vd[0]. Similarly, vd[1], vd[2], and vd[3] store the sum of vs and vt elements of corresponding indexes. Hence, a single add operation on the 64-bit vector results in 4 simultaneous additions on each of the 16-bit elements. On the other hand, if 8-bit elements were packed into the registers, one add operation performs 8 independent additions in parallel. Consequently, when a SIMD arithmetic instruction, such as addition, subtraction, or multiply, is performed on the data in the 64-bit datapath, the operation actually performs multiple numbers of operations independently and in parallel on each of the smaller elements comprising the 64 bit datapath.
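The packed addition described above can be sketched in Python. This is only an illustration: the helper names (`unpack`, `pack`, `simd_add`) and the choice to place element 0 in the least significant bits are assumptions, not details taken from the specification.

```python
def unpack(reg, width):
    """Split a 64-bit integer into 64 // width unsigned lanes (lane 0 = lowest bits)."""
    mask = (1 << width) - 1
    return [(reg >> (i * width)) & mask for i in range(64 // width)]

def pack(lanes, width):
    """Pack lanes back into one 64-bit integer, truncating each lane to `width` bits."""
    mask = (1 << width) - 1
    reg = 0
    for i, lane in enumerate(lanes):
        reg |= (lane & mask) << (i * width)
    return reg

def simd_add(vs, vt, width):
    """Add matching lanes independently; no carry crosses a lane boundary."""
    sums = [a + b for a, b in zip(unpack(vs, width), unpack(vt, width))]
    return pack(sums, width)

vs = pack([1, 2, 3, 4], 16)
vt = pack([10, 20, 30, 40], 16)
vd = simd_add(vs, vt, 16)
print(unpack(vd, 16))  # [11, 22, 33, 44]
```

With `width=8` the same 64-bit values are treated as eight lanes, so one call performs eight independent additions.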


Unfortunately however, an arithmetic operation such as add and multiply on SIMD vectors typically increases the number of significant bits in the result. For instance, an addition of two n-bit numbers may result in a number of $n+1$ bits. Moreover, a multiplication of two n-bit numbers produces a number of $2n$ bit width. Hence, the results of an arithmetic operation on a SIMD vector may not be accurate to a desired significant bit.
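The bit growth can be checked with a short Python sketch. The helper name `bits_needed` and the choice of n = 16 are assumptions made for illustration.

```python
def bits_needed(value):
    """Number of bits required to hold a non-negative integer."""
    return max(1, value.bit_length())

n = 16
largest = (1 << n) - 1                          # 0xFFFF, the largest unsigned n-bit value
sum_bits = bits_needed(largest + largest)       # 0x1FFFE
product_bits = bits_needed(largest * largest)   # 0xFFFE0001
print(sum_bits, product_bits)  # 17 32
```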

Furthermore, the nature of multimedia DSP applications often increases inaccuracies in significant bits. For example, many DSP algorithms implemented in DSP applications require a series of computations producing partial results that are larger, in terms of significant number of bits, than the final result. Since the final result does not fully account for the significant bits of these partial results, the final result may not accurately reflect the ideal result, which takes into account all significant bits of the intermediate results.

To recapture the full significant bits in a SIMD vector arithmetic operation, the size of the data in bits for each individual element was typically boosted or promoted to twice the size of the original data in bits. Thus, for multiplication on 8-bit elements in a SIMD vector for instance, the 8-bit elements were converted (i.e., unpacked) into 16-bit elements containing 8 significant bits to provide enough space to hold the subsequent product.

Unfortunately however, the boost in the number of data bits largely undermined the benefits of the SIMD vector scheme by reducing the speed of an arithmetic operation in half. This is because the boosting of data bits to twice the original size results in half as many data elements in a register. Hence, an operation on the entire 64-bit datapath comprised of 16-bit elements accomplishes only 4 operations in comparison to 8 operations on a 64-bit datapath comprised of 8-bit elements. In short, boosting a data size by X-fold reduces performance to $(1/X) \times 100$ percent of the original. As a result, instead of an effective 64-bit datapath, the effective datapath was only 32-bits wide.

Thus, what is needed is a method and system for providing extended precision in SIMD vector arithmetic operations without sacrificing speed and performance.

## SUMMARY OF THE INVENTION

The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. The register file is comprised of a plurality of general purpose registers of N bit width. The size of the accumulator is preferably an integer multiple of the size of the general purpose registers. The preferred embodiment uses registers of 64 bits and an accumulator of 192 bits. The present invention first loads, from a memory, a first set of data elements into a first vector register and a second set of data elements into a second vector register. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory and is decoded. Then, a first vector register and a second vector register are read from the register file as specified in the arithmetic instruction. The present invention then executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The result of the execution is then written into the accumulator. Then, each element in the accumulator is transformed into an N-bit width element and written into a third register for further operation or storage in the memory.

In an alternative embodiment, the accumulator contains a third set of data elements. After the arithmetic operation between the data elements in the first and second vector registers, the result of the execution is added to the corresponding elements in the accumulator. The result of the addition is then written into the accumulator. Thereafter, each element in the accumulator is transformed into an N-bit width element and written into a third register for further operation or storage in the memory.

## BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention:

Prior Art Figure 1 illustrates an exemplary single instruction multiple data (SIMD) instruction method.

Figure 2 illustrates an exemplary computer system of the present invention.

Figure 3 illustrates a block diagram of an exemplary datapath including a SIMD vector unit (VU), a register file, and a vector load/store unit according to one embodiment of the present invention.

Figure 4 illustrates a more detailed datapath architecture including the accumulator in accordance with the present invention.

Figure 5 illustrates a flow diagram of general operation of an exemplary arithmetic instruction according to a preferred embodiment of the present invention.

Figure 6 illustrates element select format for 4 16-bit elements in a 64-bit register.

Figure 7 illustrates element select format for 8 8-bit elements in a 64-bit register.

Figure 8 illustrates an exemplary ADDA.fmt arithmetic operation between elements of exemplary operand registers vs and vt.

Figure 9 illustrates an exemplary ADDL.fmt arithmetic operation between elements of exemplary operand registers vs and vt.

## DESCRIPTION OF THE PREFERRED EMBODIMENTS

In the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be obvious to one skilled in the art that the present invention may be practiced without these specific details. In other instances well known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present invention.

The present invention features a method for providing extended precision in single-instruction multiple-data (SIMD) arithmetic operations in a computer system. The preferred embodiment of the present invention performs integer SIMD vector arithmetic operations in a processor having a 64-bit wide datapath within an exemplary computer system described below. Extended precision in the SIMD arithmetic operations is supplied through the use of an accumulator register having a preferred width of 3 times the general purpose register width. Although a datapath of 64-bits is exemplified herein, the present invention is readily adaptable to datapaths of other variations in width.

## COMPUTER SYSTEM ENVIRONMENT

Figure 2 illustrates an exemplary computer system 212 comprised of a system bus 200 for communicating information, one or more central processors 201 coupled with the bus 200 for processing information and instructions, a computer readable volatile memory unit 202 (e.g., random access memory, static RAM, dynamic RAM, etc.) coupled with the bus 200 for storing information and instructions for the central processor(s) 201, and a computer readable non-volatile memory unit (e.g., read only memory, programmable ROM, flash memory, EPROM, EEPROM, etc.) coupled with the bus 200 for storing static information and instructions for the processor(s).

Computer system 212 of Figure 2 also includes a mass storage computer readable data storage device 204 (hard drive, floppy, CD-ROM, optical drive, etc.) such as a magnetic or optical disk and disk drive coupled with the bus 200 for storing information and instructions. Optionally, system 212 can include a display device 205 coupled to the bus 200 for displaying information to the user, an alphanumeric input device 206 including alphanumeric and function keys coupled to the bus 200 for communicating information and command selections to the central processor(s) 201, a cursor control device 207 coupled to the bus for communicating user input information and command selections to the central processor(s) 201, and a signal generating device 208 coupled to the bus 200 for communicating command selections to the processor(s) 201.

According to a preferred embodiment of the present invention, the processor(s) 201 is a SIMD vector unit which can function as a coprocessor for a host processor (not shown). The VU performs arithmetic and logical operations on individual data elements within a data word using the instruction methods described below. Data words are treated as vectors of Nx1 elements, where N can be 8, 16, 32, 64, or multiples thereof. For example, a set of Nx1 data elements of either 8- or 16-bit fields comprises a data doubleword of 64-bit width. Hence, a 64 bit wide double word contains either 4 16-bit elements or 8 8-bit elements.

Figure 3 illustrates a block diagram of an exemplary datapath 300 including a SIMD vector unit (VU) 302, a register file 304, a vector load/store unit 318, and crossbar circuits 314 and 316 according to one embodiment of the present invention. The VU 302 executes an operation specified in the instruction on each element within a vector in parallel. The VU 302 can operate on data that is the full width of the local on-chip memories, up to 64 bits. This allows parallel operations on 8 8-bit, 4 16-bit, 2 32-bit, or 1 64-bit elements in one cycle. The VU 302 includes an accumulator 312 to hold values to be accumulated or accumulated results.

The vector register file is comprised of 32 64-bit general purpose registers 306 through 310. The general purpose registers 306 through 310 are visible to the programmer and can be used to store intermediate results. The preferred embodiment of the present invention uses the floating point registers (FGR) of a floating point unit (FPU) as its vector registers.

In this shared arrangement, data is moved between the vector register file 304 and memory with Floating Point load and store doubleword instructions through the vector load/store unit 318. These load and store operations are unformatted. That is, no format conversions are performed and therefore no floating-point exceptions can occur due to these operations. Similarly, data is moved between the vector register file 304 and the VU 302 without format conversions, and thus no floating-point exception occurs.

Within each register, data may be written, or read, as bytes (8-bits), short-words (16-bits), words (32-bits), or double-words (64-bits). Specifically, the vector registers of the present invention are interpreted in the following new data formats: Quad Half (QH), Oct Byte (OB), Bi Word (BW), and Long (L). In QH format, a vector register is interpreted as having 16-bit elements. For example, a 64-bit vector register is interpreted as a vector of 4 signed 16-bit integers. OB format interprets a vector register as being comprised of 8-bit elements. Hence, an exemplary 64-bit vector register is seen as a vector of 8 unsigned 8-bit integers. In BW format, a vector register is interpreted as having 32-bit elements. L format interprets a vector register as having 64-bit elements. These data types are provided to be adaptable to various register sizes of a processor. As described above, data format conversion is not necessary between these formats and floating-point format.
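The four interpretations of a 64-bit register can be sketched in Python. The signedness of QH (signed) and OB (unsigned) follows the text; treating BW and L as unsigned, placing element 0 in the least significant bits, and the names `FORMATS` and `view` are assumptions for illustration.

```python
# format name -> (element width in bits, interpret elements as signed?)
FORMATS = {"OB": (8, False), "QH": (16, True), "BW": (32, False), "L": (64, False)}

def view(reg, fmt):
    """Interpret a 64-bit integer as a list of elements in the given format."""
    width, signed = FORMATS[fmt]
    mask = (1 << width) - 1
    elements = []
    for i in range(64 // width):
        value = (reg >> (i * width)) & mask
        if signed and value >> (width - 1):   # top bit set: negative in two's complement
            value -= 1 << width
        elements.append(value)
    return elements

reg = 0xFFFF_0003_0002_0001
print(view(reg, "QH"))  # [1, 2, 3, -1]
print(view(reg, "OB"))  # [1, 0, 2, 0, 3, 0, 255, 255]
```

The same bit pattern is reused for every view, which mirrors the statement that no format conversion takes place.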

With reference to Figure 3, the present invention utilizes crossbar circuits to select and route elements of a vector operand. For example, the crossbar circuit 314 allows selection of elements of a given data type and passes the selected elements as operands to the VU 302. The VU 302 performs arithmetic operations on operands comprised of elements and outputs the result to another crossbar circuit 316. This crossbar circuit 316 routes the resulting elements to corresponding element fields in registers such as vd 310 and accumulator 312. Those skilled in the art will no doubt recognize that crossbar circuits are routinely used to select and route the elements of a vector operand.

With reference to Figure 3, the present invention also provides a special register, accumulator 312, of preferably 192-bit width. This register is used to store intermediate add, subtract, or multiply results generated by one instruction with the intermediate add, subtract, or multiply results generated by either previous or subsequent instructions. The accumulator 312 can also be loaded with a vector of elements from memory through a register. In addition, the accumulator 312 is capable of forwarding data to the VU 302, which executes arithmetic instructions. Although the accumulator 312 is shown to be included in the VU 302, those skilled in the art will recognize that it can also be placed in other parts of the datapath so as to hold either accumulated results or values to be accumulated.

Figure 4 illustrates a more detailed datapath architecture including the accumulator 312. In this datapath, the contents of two registers, vs and vt, are operated on by an ALU 402 to produce a result. The result from the ALU can be supplied as an operand to another ALU such as an adder/subtractor 404. In this datapath configuration, the accumulator 312 can forward its content to be used as the other operand to the adder/subtractor 404. In this manner, the accumulator 312 can be used as both a source and a destination in consecutive cycles without causing pipe stalls or data hazards. By thus accumulating the intermediate results in their expanded form in tandem with its ability to be used as both a source and a destination, the accumulator 312 is used to provide extended precision for SIMD arithmetic operations.

An exemplary accumulator of the present invention is larger in size than general purpose registers. The preferred embodiment uses a 192-bit accumulator and 64-bit registers. The format of the accumulator is determined by the format of the elements accumulated. That is, the data type of the accumulator matches the data type of the operands specified in an instruction. For example, if the operand register is in QH format, the accumulator is interpreted to contain 4 48-bit elements. In OB format, the accumulator is seen as having 8 24-bit elements. In addition, accumulator elements are always signed. Elements are stored from or loaded into the accumulator indirectly to and from the main memory by staging the elements through the shared Floating Point register file.
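The relationship between operand format and accumulator layout can be sketched as follows, assuming the preferred 64-bit registers and 192-bit accumulator. The function name `accumulator_layout` is hypothetical.

```python
REG_BITS = 64    # general purpose register width in the preferred embodiment
ACC_BITS = 192   # accumulator width in the preferred embodiment

def accumulator_layout(elem_bits):
    """Return (number of elements, bits per accumulator element) for an operand width."""
    lanes = REG_BITS // elem_bits
    return lanes, ACC_BITS // lanes

print(accumulator_layout(16))  # (4, 48)  QH format
print(accumulator_layout(8))   # (8, 24)  OB format
```

In both cases each accumulator element is three times as wide as the operand element it corresponds to.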

Figure 5 illustrates a flow diagram of an exemplary arithmetic operation according to a preferred embodiment of the invention. In step 502, an arithmetic instruction is fetched from memory into an instruction register. Then in step 504, the instruction is decoded to determine the specific arithmetic operation, operand registers, selection of elements in operand registers, and data types. The instruction opcode specifies an arithmetic operation such as add, multiply, or subtract in its opcode field. The instruction also specifies the data type of elements, which determines the width in bits and number of elements involved in the arithmetic operation. For example, OB data type format instructs the processor to interpret a vector register as containing 8 8-bit elements. On the other hand, QH format directs the processor to interpret the vector register as having 4 16-bit elements.

The instruction further specifies two operand registers, a first register (vs) and a second register (vt). The instruction selects the elements of the second register, vt, to be used with each element of the accumulator, and/or the first register, vs. For example, the present invention allows selection of one element from the second register to be used in an arithmetic operation with all the elements in the first register independently and in parallel. The selected element is replicated for every element in the first register. In the alternative, the present invention provides selection of all elements from the second register to be used in the arithmetic operation with all the elements in the first register. The arithmetic operation operates on the corresponding elements of the registers independently and in parallel. The present invention also provides an immediate value (i.e., a constant) in a vector field in the instruction. The immediate value is replicated for every element of the second register before an arithmetic operation is performed between the first and second registers.

According to the decoded instruction, the first register and the second register with the selected elements are read for execution of the arithmetic operation in step 506. Then in step 508, the arithmetic operation encoded in the instruction is executed using each pair of the corresponding elements of first register and the second register as operands. The resulting elements of the execution are written into corresponding elements in the accumulator in step 510. According to another embodiment of the present invention, the resulting elements of the execution are added to the existing values in the accumulator elements. That is, the accumulator "accumulates" (i.e., adds) the resulting elements onto its existing elements. The elements in the accumulator are then transformed into N-bit width in step 512. Finally, in step 514, the transformed elements are stored into memory. The process then terminates in step 516.
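The flow of steps 504 through 512 can be sketched in Python. This is a simplified illustration, not the specified implementation: the dictionary-based instruction encoding, the `execute` function, and the use of a plain clamp for the step 512 transformation are all assumptions, and accumulator wrap-around is omitted.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def execute(instr, regs, acc):
    """Run one decoded instruction; returns (new accumulator, N-bit results)."""
    op = OPS[instr["op"]]                          # step 504: decode the operation
    vs, vt = regs[instr["vs"]], regs[instr["vt"]]  # step 506: read operand registers
    results = [op(a, b) for a, b in zip(vs, vt)]   # step 508: execute per element pair
    if instr["accumulate"]:                        # step 510: write the accumulator
        acc = [c + r for c, r in zip(acc, results)]
    else:
        acc = results
    lo, hi = instr["range"]                        # step 512: transform to N-bit width
    out = [max(lo, min(hi, v)) for v in acc]
    return acc, out

regs = {"vs": [30000, 2, -3, 4], "vt": [30000, 5, 6, -7]}
instr = {"op": "add", "vs": "vs", "vt": "vt",
         "accumulate": False, "range": (-32768, 32767)}
acc, vd = execute(instr, regs, [0, 0, 0, 0])
print(acc)  # [60000, 7, 3, -3]
print(vd)   # [32767, 7, 3, -3]
```

The first element shows the point of the wider accumulator: the full sum 60000 is retained there, while the 16-bit result is saturated.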


The SIMD vector instructions according to the present invention either write all 192 bits of the accumulator or all 64 bits of an FPR, or the condition codes. Results are not stored to multiple destinations, including the condition codes.

Integer vector operations that write to the FPRs clamp the values being written to the target's representable range. That is, the elements are saturated for overflows and underflows. For overflows, the values are clamped to the largest representable value. For underflows, the values are clamped to the smallest representable value.

On the other hand, integer vector operations that write to an accumulator do not clamp their values before writing, but allow underflows and overflows to wrap around the accumulator's representable range. Hence, the significant bits that otherwise would be lost are stored into the extra bits provided in the accumulator. These extra bits in the accumulator thus ensure that unwanted overflows and underflows do not occur when writing to the accumulator or FPRs.
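The difference between clamping and wrapping can be illustrated with a short Python sketch. The function names and the use of signed two's complement ranges for both cases are assumptions for illustration.

```python
def clamp_signed(value, bits):
    """Saturate to the signed range representable in `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

def wrap_signed(value, bits):
    """Wrap around the signed range representable in `bits` bits (two's complement)."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

total = 30000 + 30000
print(clamp_signed(total, 16))  # 32767  (saturated, as when writing a 16-bit register element)
print(wrap_signed(total, 16))   # -5536  (what wrapping would give in only 16 bits)
print(wrap_signed(total, 48))   # 60000  (fits in a 48-bit accumulator element)
```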

#### SELECTION OF VECTOR ELEMENTS

The preferred embodiment of the present invention utilizes an accumulator register and a set of vector registers in performing precision arithmetic operations. First, an exemplary vector register, vs, is used to hold a set of vector elements. A second exemplary vector register, vt, holds a selected set of vector elements for performing operations in conjunction with the elements in vector register, vs. The present invention allows an arithmetic instruction to select elements in vector register vt for operation with corresponding elements in other vector registers through the use of a well known crossbar method. A third exemplary vector register, vd, may be used to hold the results of operations on the elements of the registers described above. Although these registers (vs, vt, and vd) are used to associate vector registers with a set of vector elements, other vector registers are equally suitable for the present invention.

To perform arithmetic operations on desired elements of a vector, the present invention uses a well known crossbar method adapted to select an element of the vector register, vt, and replicate the element in all other element fields of the vector. That is, an element of vt is propagated to all other elements in the vector to be used with each of the elements of the other vector operand. Alternatively, all the elements of the vector, vt, may be selected without modification. Another selection method allows an instruction to specify as an element an immediate value in the instruction opcode vector field corresponding to vt and replicate the element for all other elements of vector vt. These elements thus selected are then passed onto the VU for arithmetic operation.

Figure 6 illustrates element select format for 4 16-bit elements in a 64-bit register. The exemplary vector register vt 600 is initially loaded with four elements: A, B, C, and D. The present invention allows an instruction to select or specify any one of the element formats as indicated by rows 602 through 610. For example, element B for vt 600 may be selected and replicated for all 4 elements as shown in row 604. On the other hand the vt 600 may be passed without any modification as in row 610.

Figure 7 illustrates element select format for 8 8-bit elements in a 64-bit register. The exemplary vector register vt 700 is initially loaded with eight elements: A, B, C, D, E, F, G, and H. The present invention allows an instruction to select or specify any one of the element formats as indicated by rows 702 through 718. For example, element G for vt 700 may be selected and replicated for all 8 elements as shown in row 714. On the other hand, the vt 700 may be passed without any modification as in row 718.
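The three selection modes (replicate one element, pass the vector unmodified, replicate an immediate) can be sketched in Python. The function name `select_elements` and its keyword arguments are hypothetical; the position of each element within the list is an assumption about ordering.

```python
def select_elements(vt, index=None, immediate=None):
    """Return the vt operand as seen by the vector unit.

    index=None, immediate=None: pass vt through unmodified.
    index=k: replicate element k into every element field.
    immediate=c: replicate the constant c into every element field.
    """
    if immediate is not None:
        return [immediate] * len(vt)
    if index is not None:
        return [vt[index]] * len(vt)
    return list(vt)

vt4 = ["A", "B", "C", "D"]
print(select_elements(vt4, index=1))      # ['B', 'B', 'B', 'B']
print(select_elements(vt4))               # ['A', 'B', 'C', 'D']
print(select_elements(vt4, immediate=3))  # [3, 3, 3, 3]
```

The same function covers the 8-element case; for example, `select_elements(list("ABCDEFGH"), index=6)` replicates G into all eight fields.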

## ARITHMETIC INSTRUCTIONS

In accordance with the preferred embodiment of the present invention, arithmetic operations are performed on the corresponding elements of vector registers. The instruction is fetched from main memory and is loaded into an instruction register. It specifies the arithmetic operation to be performed.

In the following arithmetic instructions, the operands are values in integer vector format. The accumulator is in the corresponding accumulator vector format. The arithmetic operations are performed between elements of vectors occupying corresponding positions in the vector field in accordance with SIMD characteristics of the present invention. For example, an add operation between vs and vt actually describes eight parallel add operations between vs[0] and vt[0] to vs[7] and vt[7]. After an arithmetic operation has been performed but before the values are written into the accumulator, a wrapped arithmetic is performed such that overflows and underflows wrap around the Accumulator's representable range.

Accumulate Vector Add (ADDA.fmt). In the ADDA.fmt instruction of the present invention, the elements in vector registers vt and vs are added to those in the accumulator. Specifically, the corresponding elements in vector registers vt and vs are added. Then, the elements of the sum are added to the corresponding elements in the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator.

Figure 8 illustrates an exemplary ADDA.fmt arithmetic operation between elements of operand registers vs 800 and vt 802. Each of the registers 800, 802, and 804 contains 4 16-bit elements. Each letter in the elements (i.e., A, B, C, D, E, F, G, H, and I) stands for a binary number. FFFF is a hexadecimal representation of the 16-bit binary number 1111 1111 1111 1111. The vs register 800 holds elements FFFF, A, B, and C. The selected elements of the vt register are FFFF, D, E, and F. The ADDA.fmt arithmetic instruction directs the VU to add corresponding elements: FFFF+FFFF (=1FFFE), A+D, B+E, and C+F. Each of these sums is then added to the corresponding existing elements (i.e., FFFF, G, H, and I) in the accumulator 804: FFFF+1FFFE, A+D+G, B+E+H, and C+F+I. The addition of the hexadecimal numbers, 1FFFE and FFFF, produces 2FFFD, an overflow condition for a 16-bit element of a general purpose 64-bit register. The accumulator's representable range is 48 bits in accordance with the present invention. Since this is more than enough bits to represent the number, the entire number 2FFFD is written into the accumulator. As a result, no bits have been lost in the addition and accumulation process.
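The accumulate-add arithmetic can be worked through in Python. The helper names `wrap` and `adda` are hypothetical, and the small integers standing in for the lettered elements A through I are arbitrary sample values, not values from Figure 8.

```python
ACC_ELEM_BITS = 48  # accumulator element width for 16-bit operand elements

def wrap(value, bits=ACC_ELEM_BITS):
    """Wrap a result around the signed range of an accumulator element."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def adda(acc, vs, vt):
    """Add matching elements of vs and vt, then add each sum to the accumulator."""
    return [wrap(c + a + b) for c, a, b in zip(acc, vs, vt)]

vs = [0xFFFF, 1, 2, 3]    # FFFF, A, B, C
vt = [0xFFFF, 4, 5, 6]    # FFFF, D, E, F
acc = [0xFFFF, 7, 8, 9]   # FFFF, G, H, I
acc = adda(acc, vs, vt)
print([hex(x) for x in acc])  # ['0x2fffd', '0xc', '0xf', '0x12']
```

The first element needs 18 bits, more than a 16-bit element can hold, yet it is stored without loss in the 48-bit accumulator element.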

Load Vector Add (ADDL.fmt). According to the ADDL.fmt instruction, the corresponding elements in vectors vt and vs are added and then stored into corresponding elements in the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator.

Figure 9 illustrates an exemplary ADDL.fmt arithmetic operation between elements of operand registers vs 900 and vt 902. Each of the registers 900, 902, and 904 contains 4 16-bit elements. Each letter in the elements (i.e., A, B, C, D, E, and F) stands for a binary number. FFFF is a hexadecimal representation of the 16-bit binary number 1111 1111 1111 1111. The vs register 900 holds elements FFFF, A, B, and C. The selected elements of the vt register are FFFF, D, E, and F. The ADDL.fmt arithmetic instruction instructs the VU to add corresponding elements: FFFF+FFFF, A+D, B+E, and C+F. The addition of hexadecimal numbers, FFFF and FFFF, produces 1FFFE, a technical overflow condition for a 16-bit element of a general purpose 64-bit register. The present invention wraps the number 1FFFE around the accumulator's representable range, which is 48 bits. Since this is more than enough bits to represent the number, the entire number 1FFFE is written into the accumulator. As a result, no bits have been lost in the addition process.

Accumulate Vector Multiply (MULA.fmt). The MULA.fmt instruction multiplies the values in vectors vt and vs. Then the product is added to the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator.

Add Vector Multiply to Accumulator (MULL.fmt). The MULL.fmt instruction multiplies the values in vectors vt and vs. Then, the product is written to the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator.

Subtract Vector Multiply from Accumulator (MULS.fmt). In the MULS.fmt instruction, the values in vector vt are multiplied by the values in vector vs, and the product is subtracted from the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator.

Load Negative Vector Multiply (MULSL.fmt). The MULSL.fmt instruction multiplies the values in vector vt with the values in vector vs. Then, the product is subtracted from the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator.

Accumulate Vector Difference (SUBA.fmt). The present SUBA.fmt instruction computes the difference between vectors vt and vs. Then, it adds the difference to the value in the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator.

Load Vector Difference (SUBL.fmt). According to the SUBL.fmt instruction, the differences of vectors vt and vs are written into the corresponding elements in the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator.
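The load and accumulate variants described above can be sketched side by side in Python. This is an illustration under assumptions: the function names simply mirror the mnemonics, a 48-bit signed accumulator element is assumed, the difference is taken as vs minus vt (the text does not state the operand order), and MULSL.fmt is left out.

```python
ACC_ELEM_BITS = 48  # assumed accumulator element width (16-bit operand elements)

def wrap(value, bits=ACC_ELEM_BITS):
    """Wrap a result around the signed range of an accumulator element."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def addl(vs, vt):       # load the sums into the accumulator
    return [wrap(a + b) for a, b in zip(vs, vt)]

def mula(acc, vs, vt):  # add the products to the accumulator
    return [wrap(c + a * b) for c, a, b in zip(acc, vs, vt)]

def mull(vs, vt):       # write the products to the accumulator
    return [wrap(a * b) for a, b in zip(vs, vt)]

def muls(acc, vs, vt):  # subtract the products from the accumulator
    return [wrap(c - a * b) for c, a, b in zip(acc, vs, vt)]

def suba(acc, vs, vt):  # add the differences to the accumulator (vs - vt assumed)
    return [wrap(c + (a - b)) for c, a, b in zip(acc, vs, vt)]

def subl(vs, vt):       # write the differences to the accumulator (vs - vt assumed)
    return [wrap(a - b) for a, b in zip(vs, vt)]

acc = mull([1000] * 4, [1000] * 4)        # 1,000,000 in each element
acc = mula(acc, [2000] * 4, [3000] * 4)   # plus 6,000,000
print(acc)  # [7000000, 7000000, 7000000, 7000000]
```

A multiply-accumulate chain like this is where the extra width matters: each product of two 16-bit values can need 32 bits, and the 48-bit element leaves headroom for many such products to be summed.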

#### ELEMENT TRANSFORMATION

After an arithmetic operation, the elements in the accumulator are transformed into the precision of the elements in the destination registers for further processing or for eventual storage into a memory unit. During the transformation process, the data in each accumulator element is packed to the precision of the destination operand. The present invention provides the following instruction method for such transformation.

Scale, Round and Clamp Accumulator (Rx.fmt). According to the Rx.fmt instruction, the values in the accumulator are shifted right by the values specified in a vector field vt in the instruction opcode. This variable shift supports application or algorithm specific fixed point precision. The vt operands are values in integer vector format. The accumulator is in the corresponding accumulator vector format.

Then, each element in the accumulator is rounded according to a mode specified by the instruction. The preferred embodiment of the invention allows three rounding modes: 1) round toward zero, 2) round to nearest with exactly halfway rounding away from zero, and 3) round to nearest with exactly halfway rounding to even. These rounding modes minimize truncation errors during arithmetic processing.

The elements are then clamped to either a signed or unsigned range of an exemplary destination vector register, vd. That is, the elements are saturated to the largest representable value for overflow and the smallest representable value for underflow. Hence, the clamping limits the resultant values to the minimum and maximum precision of the destination elements without overflow or underflow.
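The scale, round, and clamp sequence described above can be sketched as follows. This is a minimal illustration, not the patented implementation. The function names (`scale_round`, `clamp`, `rx`) and the mode strings are hypothetical, and accumulator elements are modeled as plain Python integers.

```python
def scale_round(value, shift, mode):
    """Shift `value` right by `shift` bits using the named rounding mode."""
    if shift == 0:
        return value
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    half = 1 << (shift - 1)
    if mode == "zero":          # 1) round toward zero
        return sign * (magnitude >> shift)
    if mode == "nearest_away":  # 2) nearest, exact halves away from zero
        return sign * ((magnitude + half) >> shift)
    if mode == "nearest_even":  # 3) nearest, exact halves to even
        quotient = magnitude >> shift
        remainder = magnitude - (quotient << shift)
        if remainder > half or (remainder == half and quotient & 1):
            quotient += 1
        return sign * quotient
    raise ValueError(f"unknown rounding mode: {mode}")


def clamp(value, bits, signed):
    """Saturate `value` to the signed or unsigned range of a `bits`-bit element."""
    if signed:
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        low, high = 0, (1 << bits) - 1
    return max(low, min(high, value))


def rx(acc, vt, mode, bits=16, signed=True):
    """Scale each accumulator element by its vt shift, round, then clamp."""
    return [clamp(scale_round(a, s, mode), bits, signed) for a, s in zip(acc, vt)]


signed_result = rx([100000, -100000, 5], [1, 1, 1], "nearest_away")
unsigned_result = rx([-4, 600, 7], [0, 1, 2], "zero", bits=8, signed=False)
```

Here `signed_result` is `[32767, -32768, 3]` and `unsigned_result` is `[0, 255, 1]`: values outside the destination range saturate instead of wrapping.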

#### SAVING ACCUMULATOR STATE

Since the vector accumulator is a special register, the present invention allows the contents of the accumulator to be saved in a general register. However, because the size of the elements of the accumulator is larger than the elements of general purpose registers, the transfer occurs in multiple chunks of constituent elements. The following instructions allow storage of the accumulator state.

Read Accumulator (RAC.fmt). The RAC.fmt instruction reads a portion of the accumulator elements, preferably a third of the bits in elements, and saves the elements into a vector register. Specifically, this instruction method allows the least significant, middle significant, or most significant third of the bits of the accumulator elements to be assigned to a vector register such as vd. In this operation, the values extracted are not clamped. That is, the bits are simply copied into the elements of vector register, vd.

Write Accumulator High (WACH.fmt). The WACH.fmt instruction loads portions of the accumulator from a vector register. Specifically, this instruction method writes the most significant third of the bits of the accumulator elements from a vector register such as vs. The least significant two thirds of the bits of the accumulator are not affected by this operation.

Write Accumulator Low (WACL.fmt). According to the WACL.fmt instruction, the present invention loads two thirds of the accumulator from two vector registers. Specifically, this instruction method writes the least significant two thirds of the bits of the accumulator elements. The remaining upper one third of the bits of the accumulator elements are written by the sign bits of the corresponding elements of a vector register such as vs, replicated by 16 or 8 times, depending on the data type format specified in the instruction.

RACL/RACM/RACH instructions followed by WACL/WACH instructions are used to save and restore the accumulator. This save/restore function is format independent; either format can be used to save or restore accumulator values generated by either QH or OB operations. Data conversion need not occur. The mapping between element bits of the OB format accumulator and bits of the same accumulator interpreted in QH format is implementation specific, but consistent for each implementation.
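The save and restore sequence can be illustrated with a short Python sketch under stated assumptions. Accumulator elements are modeled as raw 48-bit patterns split into three 16-bit thirds (the QH case). The sketch assumes that, for the WACL-style write, vs supplies the middle third and vt supplies the low third. The function names `rac`, `wacl`, and `wach` are hypothetical stand-ins for the instructions described above.

```python
N = 16               # element width assumed for the QH format
MASK = (1 << N) - 1  # selects one N-bit third


def rac(acc, third):
    """Copy one third (0 = low, 1 = middle, 2 = high) of each element, unclamped."""
    return [(a >> (third * N)) & MASK for a in acc]


def wacl(vs, vt):
    """Write the low two thirds; fill the high third with the sign bit of vs."""
    out = []
    for s, t in zip(vs, vt):
        sign_fill = MASK if (s >> (N - 1)) & 1 else 0  # sign bit replicated N times
        out.append((sign_fill << (2 * N)) | ((s & MASK) << N) | (t & MASK))
    return out


def wach(acc, vs):
    """Write the high third from vs; leave the low two thirds untouched."""
    low_two_thirds = (1 << (2 * N)) - 1
    return [((s & MASK) << (2 * N)) | (a & low_two_thirds) for a, s in zip(acc, vs)]


acc = [0x123456789ABC, 0xFFFF00000001, 0]

# Save: three reads, one per third.
saved_low, saved_mid, saved_high = rac(acc, 0), rac(acc, 1), rac(acc, 2)

# Restore: low two thirds first, then the high third.
restored = wach(wacl(saved_mid, saved_low), saved_high)
```

Because the reads copy bits without clamping, `restored` equals `acc` bit for bit in this model, regardless of how the saved thirds are interpreted in between.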

The present invention, a method for providing extended precision in SIMD vector arithmetic operations, utilizes an accumulator register. While the present invention has been described in particular embodiments, it should be appreciated that the present invention should not be construed as being limited by such embodiments, but rather construed according to the claims below.

## CLAIMS

What is claimed is:

1. In a computer system including a processor which contains a first set of N-bit data elements loaded into a first register and a second set of N-bit data elements loaded into a second register, a method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:
  - fetching an arithmetic instruction from a memory unit;
  - decoding the arithmetic instruction and reading the first vector register and the second vector register;
  - executing the arithmetic instruction on corresponding N-bit data elements in the first register and second register to produce corresponding resulting elements;
  - writing the resulting elements into corresponding elements of an accumulator;
  - transforming the each resulting element in the accumulator into N-bits; and
  - writing the transformed elements of N-bit width into a third register.
2. The method as recited in Claim 1, wherein said decoding step further comprises the steps of:
  - selecting an element from the second register; and
  - copying the selected element into the other elements in the second register.

3. The method as recited in Claim 1, wherein said arithmetic instruction is an addition of corresponding vector elements in the first and second vector registers.

4. The method as recited in Claim 1, wherein said arithmetic instruction is a multiplication of corresponding vector elements in the first and second vector registers.

5. The method as recited in Claim 1, wherein said arithmetic instruction is a subtraction of second vector register elements from the first vector register elements.

6. The method as recited in Claim 1, wherein said accumulator is a register having an integer multiples of 64-bit width.

7. The method as recited in Claim 1, wherein said accumulator is a register of 192-bits.

8. The method as recited in Claim 1, wherein said transformation step further comprises the steps of:
  - scaling the resulting elements in the accumulator by shifting the values in the resulting elements;
  - rounding the scaled resulting elements in the accumulator; and
  - clamping the rounded resulting elements.

9. The method as recited in Claim 1, wherein said third register writing step further comprises the steps of:
  - reading a portion of the accumulator elements; and
  - writing the portion of the accumulator elements into the corresponding elements of said third register.

10. The method as recited in Claim 9, wherein the portion is either the low third bits or the high third bits of the elements in the accumulator.

11. The method as recited in Claim 1, wherein the values in the resulting elements are wrapped around the representable range of the accumulator elements.

12. The method as recited in Claim 1, wherein the data elements are integers.

13. The method as recited in Claim 1, wherein the first register, the second register, and the third registers are floating point registers.

14. The method as recited in Claim 1, wherein the first register, the second register, and the third register are each 64-bit wide.

15. The method as recited in Claim 1, wherein N is 8.

16. The method as recited in Claim 1, wherein N is 16.

17. The method as recited in Claim 15, wherein the elements in the accumulator are each 24 bit wide.

18. The method as recited in Claim 16, wherein the elements in the accumulator are each 48 bit wide.

19. The method as recited in Claim 1, wherein said third register writing step further comprises the steps of:
  - reading a portion of the accumulator elements; and
  - writing the portion of the accumulator elements into the corresponding elements of said third register.

20. The method as recited in Claim 19, wherein the portion is chosen from the low third bits, the middle third bits, or the high third bits of the elements in the accumulator.
21. In a computer system including a processor which contains a first set of N-bit data elements loaded into a first register, a second set of N-bit data elements loaded into a second register, and an accumulator having a third set of data elements, a method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:
  - fetching an arithmetic instruction from a memory unit;
  - decoding the arithmetic instruction and reading the first vector register and the second vector register;
  - executing the arithmetic instruction on corresponding data elements in the first and second vector registers to produce corresponding resulting elements;
  - adding the resulting elements to the corresponding elements in the accumulator;
  - writing the resulting elements into the accumulator;
  - transforming the each resulting element in the accumulator into an N-bit width element; and
  - writing the transformed elements of N-bit width into a third register.

22. The method as recited in Claim 21, wherein said decoding step further comprises the steps of:
  - selecting an element from the second register; and
  - copying the selected element into the other elements in the second register.

23. The method as recited in Claim 21, wherein said arithmetic instruction is an addition of corresponding vector elements in the first and second vector registers.

24. The method as recited in Claim 21, wherein said arithmetic instruction is a multiplication of corresponding vector elements in the first and second vector registers.

25. The method as recited in Claim 21, wherein said arithmetic instruction is a subtraction of second vector register elements from the first vector register elements.

26. The method as recited in Claim 21, wherein said accumulator is a register having an integer multiples of 64-bit width.

27. The method as recited in Claim 21, wherein said accumulator is a register of 192-bits.

28. The method as recited in Claim 21, wherein said transformation step further comprises the steps of:
  - scaling the resulting elements in the accumulator by shifting the values in the resulting elements;
  - rounding the scaled resulting elements in the accumulator; and
  - clamping the rounded resulting elements.

29. The method as recited in Claim 21, wherein said third register writing step further comprises the steps of:
  - reading a portion of the accumulator elements; and
  - writing the portion of the accumulator elements into the corresponding elements of said third register.

30. The method as recited in Claim 29, wherein the portion is either the low third bits or high third bits of the elements in the accumulator.

31. The method as recited in Claim 21, wherein the values in the resulting elements are wrapped around the representable range of the accumulator elements.

32. The method as recited in Claim 21, wherein the data elements are integers.

33. The method as recited in Claim 21, wherein the first register, the second register, and the third registers are floating point registers.

34. The method as recited in Claim 21, wherein the first register, the second register, and the third register are each 64-bit wide.

35. The method as recited in Claim 21, wherein N is 8.

36. The method as recited in Claim 21, wherein N is 16.

37. The method as recited in Claim 35, wherein the elements in the accumulator are each 24 bit wide.

38. The method as recited in Claim 36, wherein the elements in the accumulator are each 48 bit wide.

39. The method as recited in Claim 21, wherein said third register writing step further comprises the steps of:
  - reading a portion of the accumulator elements; and
  - writing the portion of the accumulator elements into the corresponding elements of said third register.

40. The method as recited in Claim 39, wherein the portion is chosen from the low third bits, the middle third bits, or the high third bits of the elements in the accumulator.

## ABSTRACT

The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into a first vector register and a second vector register, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, a first vector register and a second vector register are read from the register file. The present invention then executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The result of the execution is then written into the accumulator. Then, each element in the accumulator is transformed into an N-bit width element and stored into the memory.



FIG. 1 (Prior Art)

FIG. 2: COMPUTER SYSTEM 212

FIG. 3: Vector Load/Store Unit 318 (to main memory); SIMD Vector Unit 302; Accumulator 312 (bits 191 to 0); Register File 304 containing vector registers vs 306, vt 308, and vd 310 (bits 63 to 0); other reference numerals 300, 314, 316, 324, and 326.

FIG. 4

FIG. 5

QH DATA TYPE ELEMENT SELECT FORMAT

vt (reference numeral 600), by bit position:

| 63-48 | 47-32 | 31-16 | 15-0 | Ref. |
|-------|-------|-------|------|------|
| A     | A     | A     | A    | 602  |
| B     | B     | B     | B    | 604  |
| C     | C     | C     | C    | 606  |
| D     | D     | D     | D    | 608  |
| D     | C     | B     | A    | 610  |

FIG. 6

OB DATA TYPE ELEMENT SELECT FORMAT

|   |   |   |   |   |   |   |   | Ref. |
|---|---|---|---|---|---|---|---|------|
| A | A | A | A | A | A | A | A | 702  |
| B | B | B | B | B | B | B | B | 704  |
| C | C | C | C | C | C | C | C | 706  |
| D | D | D | D | D | D | D | D | 708  |
| E | E | E | E | E | E | E | E | 710  |
| F | F | F | F | F | F | F | F | 712  |
| G | G | G | G | G | G | G | G | 714  |
| H | H | H | H | H | H | H | H | 716  |
| H | G | F | E | D | C | B | A | 718  |

FIG. 7



FIG. 8

FIG. 9



## **EXHIBIT E**



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re application of:

Hsu *et al.*

Appl. No.: 09/223,046

Filed: December 30, 1998

For: **Method for Providing Extended Precision in SIMD Vector Arithmetic Operations**

Art Unit: 2783

Examiner: L. Donaghue

Atty. Docket: 0056.10US

**Amendment and Reply under 37 C.F.R. § 1.111**

Commissioner for Patents  
Washington, D.C. 20231

Sir:

In reply to the Office Action dated February 11, 2000, (PTO Prosecution File Wrapper Paper No. 15), Applicants submit the following Amendment and Remarks.

It is not believed that extensions of time or fees for net addition of claims are required beyond those that may otherwise be provided for in documents accompanying this paper. However, if additional extensions of time are necessary to prevent abandonment of this application, then such extensions of time are hereby petitioned under 37 C.F.R. § 1.136(a), and any fees required therefor (including fees for net addition of claims) are hereby authorized to be charged to our Deposit Account No. 19-0036.

*Amendments*

*In the Claims:*

Please cancel claim 1 without prejudice to or disclaimer of the subject matter contained therein.

Please add the following new claims 41-78:

--41. A computer-based method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:

- (a) loading a first vector into a first register, said first vector comprising a plurality of N-bit elements;
- (b) loading a second vector into a second register, said second vector comprising a plurality of N-bit elements;
- (c) executing an arithmetic instruction for at least one pair consisting of an N-bit element in said first register and an N-bit element in said second register, to produce a resulting element;
- (d) writing said resulting element into an M-bit element of an accumulator, wherein M is greater than N;
- (e) transforming said resulting element in said accumulator into a width of N-bits; and
- (f) writing said resulting element into a third register.

42. The method as recited in claim 41, wherein said accumulator comprises a plurality of M-bit elements and wherein steps (c)-(f) operate on a plurality of elements of said first and second vectors to produce a resultant vector formed from a plurality of resulting elements written to said third register.

43. The method as recited in claim 42, further comprising a step before step (c) of:
  - selecting an element from said second register; and
  - copying said element into all other elements in said second register.

44. The method as recited in claim 42, further comprising a step before step (f) of:
  - selecting a subset of said resulting elements in said accumulator for writing to said third register, said subset being chosen from any one of: the low third bits, the middle third bits, and the high third bits of said resulting elements in said accumulator.

45. The method as recited in claim 42, wherein M is equal to three times N.

46. The method as recited in claim 45, wherein N is equal to eight or sixteen.

47. The method as recited in claim 42, wherein said resulting elements in said accumulator are wrapped around the representable range of said resulting elements.

48. The method as recited in claim 42, further comprising a step before step (f) of:
  - dividing said resulting elements stored in said accumulator into a plurality of subsets;
  - writing each subset to at least one of a plurality of registers, each of said plurality of registers having a width smaller than said accumulator width.

49. The method as recited in claim 41, wherein said loading step (a) and said loading step (b) are not formatted.

50. The method as recited in claim 41, further comprising a step before step (d) of:
  - formatting said resulting element as specified in said arithmetic instruction.

51. The method as recited in claim 41, wherein said arithmetic instruction is any one of: addition, multiplication and subtraction.

52. The method as recited in claim 41, wherein step (e) comprises the steps of:
  - shifting said resulting element in said accumulator for scaling the value of said resulting element;
  - rounding said resulting element; and
  - clamping said resulting element.

53. The method as recited in claim 52, wherein said rounding step comprises one of:
  - rounding said resulting element towards zero;
  - rounding said resulting element towards the nearest unit, wherein said resulting element is rounded away from zero if said resulting element is at least halfway towards the nearest unit; and
  - rounding said resulting element towards the nearest unit, wherein said resulting element is rounded towards zero if said resulting element is at least halfway towards the nearest unit.

55. The method as recited in claim 41, wherein N is any one of: eight, sixteen, thirty-two and sixty-four.

56. The method as recited in claim 55, wherein said N-bit elements are integers.

57. The method as recited in claim 55, wherein each of said first and second vectors has a width of 64 bits.

59. The method as recited in claim 58, wherein said accumulator is a register having a width of 192 bits.

60. The method as recited in claim 41, wherein said first register, said second register, and said third register are floating point registers.

61. The method as recited in claim 41, wherein said first register, said second register, and said third register each have a width of 64-bits.

62. A processor for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising:
  - means for executing an arithmetic instruction involving an element of a first vector and an element of a second vector to produce a resulting element, said first and second vector comprising a plurality of N-bit elements;
  - an accumulator for receiving said resulting element, wherein said resulting element is stored in an M-bit element of said accumulator and wherein M is greater than N;
  - means for transforming said resulting element in said accumulator into a width of N-bits; and
  - means for writing said transformed resulting element to a register.

63. The processor as recited in claim 62, wherein said accumulator comprises a plurality of M-bit elements and wherein said means for executing is repeated for said plurality of elements of said first and second vectors to produce a plurality of resulting elements that are received by said accumulator and wherein said means for transforming and said means for writing are performed on said plurality of resulting elements.

64. The processor as recited in claim 63, wherein means for writing comprises:
  - selecting a subset of said resulting elements in said accumulator for writing to said register, said subset being chosen from any one of: the low third bits, the middle third bits, and the high third bits of said resulting elements in said accumulator.

65. The processor as recited in claim 63, wherein M is equal to three times N.

66. The processor as recited in claim 65, wherein N is equal to eight or sixteen.

67. The system as recited in claim 63, wherein said resulting elements in said accumulator are wrapped around the representable range of said resulting elements.

68. The system as recited in claim 63, further comprising:
  - dividing said resulting elements stored in said accumulator into a plurality of subsets;
  - writing each subset to at least one of a plurality of registers, each of said plurality of registers having a width smaller than said accumulator width.

69. The system as recited in claim 62, further comprising:
  - means for formatting said resulting element in said accumulator as specified in said arithmetic instruction.

70. The processor as recited in claim 62, wherein said arithmetic instruction is any one of: addition, multiplication and subtraction.

71. The processor as recited in claim 62, wherein means for transforming comprises:
  - means for shifting said resulting element in said accumulator for scaling the value of said resulting element;
  - means for rounding said resulting element; and
  - means for clamping said resulting element.

72. The processor as recited in claim 71, wherein said rounding means comprises one of:
  - means for rounding said resulting element towards zero;
  - means for rounding said resulting element towards the nearest unit, wherein said resulting element is rounded away from zero if said resulting element is at least halfway towards the nearest unit; and
  - means for rounding said resulting element towards the nearest unit, wherein said resulting element is rounded towards zero if said resulting element is at least halfway towards the nearest unit.

73. The processor as recited in claim 62, further comprising:
  - means for adding an element previously stored in said accumulator to said resulting element, upon reception of said resulting element by said accumulator.

74. The processor as recited in claim 62, wherein N is any one of: eight, sixteen, thirty-two and sixty-four.

75. The processor as recited in claim 74, wherein said N-bit elements are integers.

76. The processor as recited in claim 74, wherein each of said first and said second vectors has a width of 64 bits.

77. The processor as recited in claim 76, wherein said accumulator is a register having a width equal to an integer multiple of 64 bits.

78. The processor as recited in claim 77, wherein said accumulator is a register having a width of 192 bits.--

***Remarks***

Upon entry of the foregoing amendment, claims 41-78 are pending in the application. This amendment seeks to cancel claim 1 and add new claims 41-78. These changes are believed to introduce no new matter, and their entry is respectfully requested.

The Examiner has rejected claim 1 under 35 U.S.C. § 101 for double patenting in view of U.S. Patent No. 5,864,703. Applicants have canceled claim 1. Thus, this rejection is now moot. By the foregoing, Applicants seek to add new claims 41-78. Favorable consideration and allowance of these new claims is respectfully solicited.

The Examiner is invited to telephone the undersigned representative if he believes that an interview might be useful for any reason.

Respectfully submitted,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.



Michael B. Ray

Attorney for Applicants  
Registration No. 33,997

Date: 8/11/00  
1100 New York Ave, N.W., Suite 600  
Washington, DC 20005  
(202) 371-2600

MBR/MPT/mmb/agj  

## **EXHIBIT F**



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re application of:

Hsu *et al.*

Appl. No. 09/223,046

Filed: December 30, 1998

For: **Method for Providing Extended  
Precision In SIMD Vector  
Arithmetic Operations**

Art Unit: 2154

Examiner: Donaghue, L.

Atty. Docket: 0056.10US

**Batch No. R88**

**Amendment Under 37 C.F.R. § 1.312**

*Attn: Box Issue Fee*

Commissioner for Patents  
Washington, D.C. 20231

Sir:

Submitted herein is an Amendment Under 37 C.F.R. § 1.312. As payment of the issue fee has not yet been made or is filed herewith, Applicants respectfully submit that filing under paragraph (a) of 37 C.F.R. § 1.312 is proper. (M.P.E.P. § 714.16.)

It is believed that extensions of time are not required beyond those that may otherwise be provided for in documents accompanying this Amendment. However, if additional extensions of time are necessary to prevent abandonment of this application, then such extensions of time are hereby petitioned under 37 C.F.R. § 1.136(a), and any fees required therefor are hereby authorized to be charged to our Deposit Account No. 19-0036.

**If the Examiner believes, for any reason, that personal communication will expedite acceptance of this Amendment, the Examiner is invited to telephone the undersigned at the number provided.**

***Amendment***

Please enter the following Amendment:

***In the Drawings:***

Please amend FIG. 2 as shown in red.

***Remarks***

Applicants have noticed that within computer system 212 of FIG. 2, an element number "204" is erroneously present. Accordingly, Applicants are now submitting a new FIG. 2 without the incorrect element number. FIG. 2 is described in the Brief Description of the Drawings on page 6 of the specification and in the Detailed Description of Preferred Embodiments on pages 7 and 8 of the specification. Applicants assert that the corrected FIG. 2 does not constitute new matter because this drawing is clearly consistent with the description of FIG. 2 in the application as originally filed. Entry of the above amendment is respectfully requested.

Applicants have concurrently submitted a *Request to Approve Proposed Drawing Corrections* and one sheet of drawings containing the proposed correction to original FIG. 2, shown in red ink. The proposed changes add no new matter to this application.

Respectfully submitted,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.

  
Michael B. Ray 36,013  
Attorney for Applicants  
Registration No. 33,997

Date: 12/5/00

1100 New York Avenue, N.W., Suite 600  
Washington, D.C. 20005-3934  
(202) 371-2600  
MBR/MPT/agj  



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re application of:

Hsu *et al.*

Appl. No. 09/223,046

Filed: December 30, 1998

For: **Method for Providing Extended  
Precision In SIMD Vector  
Arithmetic Operations**

Art Unit: 2154

Examiner: Donaghue, L.

Atty. Docket: 0056.10US

**Batch No. R88**

**Request to Approve Proposed Drawing Corrections**

Commissioner for Patents  
Washington, D.C. 20231

Sir:

Attached is a copy of one drawing sheet containing a proposed correction to Figure 2, shown in red. The proposed change adds no new matter to this application. Applicants request that the Examiner approve the proposed correction. Also submitted herewith are Formal Drawings, which correspond to the changes noted on the attached request.

Respectfully submitted,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.

  
Michael B. Ray  
Attorney for Applicants  
Registration No. 33,997

Date: 12/5/00  
1100 New York Avenue, N.W., Suite 600  
Washington, D.C. 20005-3934  
(202) 371-2600

FIG. 2: COMPUTER SYSTEM 212



## **EXHIBIT G**



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re application of:

Timothy van Hook et al.

Appl. No. 09/223,046

Filed: December 30, 1998

For: **Methods for Providing Extended Precision in SIMD Vector Arithmetic Operations**

Confirmation No.: 2296

Art Unit: 2154

Examiner: L. Donaghue

Atty. Docket: 0056.10US

**Amendment and Reply under 37 C.F.R. § 1.111**

Commissioner for Patents  
Washington, D.C. 20231

Sir:

In reply to the Office Action dated March 15, 2002, (PTO Prosecution File Wrapper Paper No. 39), Applicants submit the following Amendment and Remarks. This Amendment is provided in the following format:

- (A) A clean version of each replacement paragraph/section/claim along with clear instructions for entry;
- (B) Starting on a separate page, appropriate remarks and arguments. 37 C.F.R. § 1.121 and MPEP 714; and
- (C) Starting on a separate page, a marked-up version entitled: "Version with markings to show changes made."

It is not believed that extensions of time or fees for net addition of claims are required beyond those that may otherwise be provided for in documents accompanying this paper. However, if additional extensions of time are necessary to prevent abandonment of this application, then such extensions of time are hereby petitioned under 37 C.F.R. § 1.136(a), and any fees required therefor (including fees for net addition of claims) are hereby authorized to be charged to our Deposit Account No. 19-0036.

***Amendments***

***In the Claims:***

Please cancel claims 41-46, 48-52, 54-66, 68-71 and 73-78 without prejudice or disclaimer.

Please substitute the following claims 47 and 67 for the pending claims 47 and 67:

47. (Once Amended) A computer-based method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:

- (a) loading a first vector into a first register, said first vector comprising a plurality of N-bit elements;
- (b) loading a second vector into a second register, said second vector comprising a plurality of N-bit elements;
- (c) executing an arithmetic instruction for at least one pair consisting of an N-bit element in said first register and an N-bit element in said second register, to produce a resulting element;
- (d) writing said resulting element into an M-bit element of an accumulator, wherein M is greater than N;
- (e) transforming said resulting element in said accumulator into a width of N-bits; and
- (f) writing said resulting element into a third register;  
wherein said accumulator comprises a plurality of M-bit elements and wherein steps (c)-(f) operate on a plurality of elements of said first and second vectors to produce a resultant vector formed from a plurality of resulting elements written to said third register; and  
wherein said resulting elements in said accumulator are wrapped around the representable range of said resulting elements.

53. (Once Amended) The method as recited in claim 80, wherein said rounding step comprises one of:

rounding said resulting element towards zero;

rounding said resulting element towards the nearest unit, wherein said resulting element is rounded away from zero if said resulting element is at least halfway towards the nearest unit; and

rounding said resulting element towards the nearest unit, wherein said resulting element is rounded towards zero if said resulting element is at least halfway towards the nearest unit.

67. (Once Amended) A processor for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising:

means for executing an arithmetic instruction involving an element of a first vector and an element of a second vector to produce a resulting element, said first and second vector comprising a plurality of N-bit elements;

an accumulator for receiving said resulting element, wherein said resulting element is stored in an M-bit element of said accumulator and wherein M is greater than N;

means for transforming said resulting element in said accumulator into a width of N-bits; and

means for writing said transformed resulting element to a register;

wherein said accumulator comprises a plurality of M-bit elements and wherein said means for executing is repeated for said plurality of elements of said first and second vectors to produce a plurality of resulting elements that are received by said accumulator and wherein said means for transforming and said means for writing are performed on said plurality of resulting elements; and

wherein said resulting elements in said accumulator are wrapped around the representable range of said resulting elements.

72. (Once Amended) The processor as recited in claim 82, wherein said rounding means comprises one of:

means for rounding said resulting element towards zero;

means for rounding said resulting element towards the nearest unit, wherein said resulting element is rounded away from zero if said resulting element is at least halfway towards the nearest unit; and

means for rounding said resulting element towards the nearest unit, wherein said resulting element is rounded towards zero if said resulting element is at least halfway towards the nearest unit.

Please add the following new claims 79-82:

79. A computer-based method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:

- (a) loading a first vector into a first register, said first vector comprising a plurality of N-bit elements;
- (b) loading a second vector into a second register, said second vector comprising a plurality of N-bit elements;
- (c) executing an arithmetic instruction for at least one pair consisting of an N-bit element in said first register and an N-bit element in said second register, to produce a resulting element;
- (d) writing said resulting element into an M-bit element of an accumulator, wherein M is greater than N;
- (e) transforming said resulting element in said accumulator into a width of N-bits;
- (f) dividing said resulting elements stored in said accumulator into a plurality of subsets;
- (g) writing each subset to at least one of a plurality of registers, each of said plurality of registers having a width smaller than said accumulator width; and
- (h) writing said resulting element into a third register;

wherein said accumulator comprises a plurality of M-bit elements and wherein steps (c)-(h) operate on a plurality of elements of said first and second vectors to produce a resultant vector formed from a plurality of resulting elements written to said third register.

80. A computer-based method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:

- (a) loading a first vector into a first register, said first vector comprising a plurality of N-bit elements;
- (b) loading a second vector into a second register, said second vector comprising a plurality of N-bit elements;
- (c) executing an arithmetic instruction for at least one pair consisting of an N-bit element in said first register and an N-bit element in said second register, to produce a resulting element;
- (d) writing said resulting element into an M-bit element of an accumulator, wherein M is greater than N;
- (e) transforming said resulting element in said accumulator into a width of N-bits, wherein said transforming comprises shifting said resulting element in said accumulator for scaling the value of said resulting element, rounding said resulting element and clamping said resulting element; and
- (f) writing said resulting element into a third register.

81. A processor for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising:

means for executing an arithmetic instruction involving a first plurality of elements of a first vector and a second plurality of elements of a second vector to produce a plurality of resulting elements, said first and second vector comprising a plurality of N-bit elements;

an accumulator for receiving said plurality of resulting elements, wherein said plurality of resulting elements are each stored in one of a plurality of M-bit elements of said accumulator and wherein M is greater than N;

means for transforming said plurality of resulting elements in said accumulator into a width of N-bits;

means for dividing said plurality of resulting elements stored in said accumulator into a plurality of subsets; and

means for writing each subset to at least one of a plurality of registers, each of said plurality of registers having a width smaller than said accumulator width.

82. A processor for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising:

means for executing an arithmetic instruction involving an element of a first vector and an element of a second vector to produce a resulting element, said first and second vector comprising a plurality of N-bit elements;

an accumulator for receiving said resulting element, wherein said resulting element is stored in an M-bit element of said accumulator and wherein M is greater than N;

means for transforming said resulting element in said accumulator into a width of N-bits, wherein said means for transforming comprises means for shifting said resulting element in said accumulator for scaling the value of said resulting element, means for rounding said resulting element, and means for clamping said resulting element; and

means for writing said transformed resulting element to a register.

***Remarks***

Reconsideration of this application is respectfully requested.

Upon entry of the foregoing amendment, claims 47, 53, 67, 72 and 79-82 are pending in the application, with 47, 67 and 79-82 being the independent claims. Claims 41-46, 48-52, 54-66, 68-71 and 73-78 are sought to be cancelled without prejudice to or disclaimer of the subject matter therein. Claims 47, 53, 67 and 72 have been amended. New independent claims 79-82 have been added to replace allowable dependent claims 48, 52, 68 and 71, respectively, including the features of their respective base claims. These changes are believed to introduce no new matter, and their entry is respectfully requested.

Based on the above amendment and the following remarks, Applicants respectfully request that the Examiner reconsider all outstanding objections and rejections and that they be withdrawn.

***Examiner Interview***

Applicants and Applicants' representative wish to thank Examiner Donaghue for conducting the personal interview with Applicants' undersigned representative on January 24, 2002. The Examiner Interview Summary Record accurately reflects the substance of the interview.

***Rejections and Amendments***

In the Office Action dated March 15, 2002, claims 47, 48, 52, 53, 67, 68, 71 and 72 were "objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims." Accordingly, by way of the above Amendments, Applicants have placed these claims in independent form. To expedite issuance of the allowed claims, the rejected claims are being cancelled without prejudice or disclaimer. Applicants reserve the right and hereby give notice of their intent to pursue those claims and traverse the rejection in a continuation application, which will be filed in due course.

In conclusion, Applicants respectfully request that the allowed claims be passed to issue and that the rejections be withdrawn as moot in light of the cancellation of the rejected claims.

*Conclusion*

All of the stated grounds of objection and rejection have been properly traversed, accommodated, or rendered moot. Applicants therefore respectfully request that the Examiner reconsider all presently outstanding objections and rejections and that they be withdrawn. Applicants believe that a full and complete reply has been made to the outstanding Office Action and, as such, the present application is in condition for allowance. If the Examiner believes, for any reason, that personal communication will expedite prosecution of this application, the Examiner is invited to telephone the undersigned at the number provided.

Prompt and favorable consideration of this Amendment and Reply is respectfully requested.

Respectfully submitted,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.



Donald J. Featherstone  
Attorney for Applicants  
Registration No. 33,876

Date: 6/14/02

1100 New York Avenue, N.W.  
Suite 600  
Washington, D.C. 20005-3934  
(202) 371-2600

DJF/mmb  
SKGF\_DC1:21053.5

SKGF Rev. 4/9/02

**Version with markings to show changes made**

Claims 41-46, 48-52, 54-66, 68-71 and 73-78 have been cancelled.

47. (Once Amended) [The method as recited in claim 42,] A computer-based method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:

- (a) loading a first vector into a first register, said first vector comprising a plurality of N-bit elements;
- (b) loading a second vector into a second register, said second vector comprising a plurality of N-bit elements;
- (c) executing an arithmetic instruction for at least one pair consisting of an N-bit element in said first register and an N-bit element in said second register, to produce a resulting element;
- (d) writing said resulting element into an M-bit element of an accumulator, wherein M is greater than N;
- (e) transforming said resulting element in said accumulator into a width of N-bits; and
- (f) writing said resulting element into a third register;  
wherein said accumulator comprises a plurality of M-bit elements and wherein steps (c)-(f) operate on a plurality of elements of said first and second vectors to produce a resultant vector formed from a plurality of resulting elements written to said third register; and

wherein said resulting elements in said accumulator are wrapped around the representable range of said resulting elements.

53. (Once Amended) The method as recited in claim [52] 80, wherein said rounding step comprises one of:

rounding said resulting element towards zero;

rounding said resulting element towards the nearest unit, wherein said resulting element is rounded away from zero if said resulting element is at least halfway towards the nearest unit; and

rounding said resulting element towards the nearest unit, wherein said resulting element is rounded towards zero if said resulting element is at least halfway towards the nearest unit.

67. (Once Amended) [The system as recited in claim 63,] A processor for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising:

means for executing an arithmetic instruction involving an element of a first vector and an element of a second vector to produce a resulting element, said first and second vector comprising a plurality of N-bit elements;

an accumulator for receiving said resulting element, wherein said resulting element is stored in an M-bit element of said accumulator and wherein M is greater than N;

means for transforming said resulting element in said accumulator into a width of N-bits; and

means for writing said transformed resulting element to a register;

wherein said accumulator comprises a plurality of M-bit elements and wherein said means for executing is repeated for said plurality of elements of said first and second vectors to produce a plurality of resulting elements that are received by said accumulator and wherein said means for transforming and said means for writing are performed on said plurality of resulting elements; and

wherein said resulting elements in said accumulator are wrapped around the representable range of said resulting elements.

72. (Once Amended) The processor as recited in claim [71] 82, wherein said rounding means comprises one of:

means for rounding said resulting element towards zero;

means for rounding said resulting element towards the nearest unit, wherein said resulting element is rounded away from zero if said resulting element is at least halfway towards the nearest unit; and

means for rounding said resulting element towards the nearest unit, wherein said resulting element is rounded towards zero if said resulting element is at least halfway towards the nearest unit.

New claims 79-82 have been added.

## **EXHIBIT H**



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

|                                                                                          |                              |
|------------------------------------------------------------------------------------------|------------------------------|
| In re application of:                                                                    | Confirmation No.: 2296       |
| Van Hook <i>et al.</i>                                                                   | Art Unit: 2154               |
| Appl. No.: 09/223,046                                                                    | Examiner: Donaghue, Larry D. |
| Filed: December 30, 1998                                                                 | Atty. Docket: 1778.0110001   |
| For: <b>Method for Providing Extended Precision in SIMD Vector Arithmetic Operations</b> |                              |

**Amendment Under 37 C.F.R. § 1.114**

Commissioner for Patents  
P.O. Box 1450  
Alexandria, VA 22313-1450

Sir:

Following the filing of a Request for Continued Examination under 37 C.F.R. § 1.114 on December 30, 2003, Applicants submit herewith the following Amendment and Remarks. This Amendment is provided in the following format:

- (A) Each section begins on a separate sheet;
- (B) Starting on a separate sheet, amendments to the specification by presenting replacement paragraphs marked up to show changes made;
- (C) Starting on a separate sheet, a complete listing of all of the claims:
  - in ascending order;
  - with status identifiers; and
  - with markings in the currently amended claims;
- (D) Starting on a separate sheet, the Remarks.

It is not believed that extensions of time or fees for net addition of claims are required beyond those that may otherwise be provided for in documents accompanying this paper. However, if additional extensions of time are necessary to prevent abandonment of this application, then such extensions of time are hereby petitioned under 37 C.F.R. § 1.136(a), and any fees required therefor (including fees for net addition of claims) are hereby authorized to be charged to our Deposit Account No. 19-0036.

***Amendments to the Specification***

Please amend the specification as indicated.

Please amend the paragraph starting on page 7, line 22, as follows:

FIG. 2 illustrates an exemplary computer system 212 comprised of a system bus 200 for communicating information, one or more central processors 201 coupled with the bus 200 for processing information and instructions, a computer readable volatile memory unit 202 (e.g., random access memory, static RAM, dynamic RAM, etc.) coupled with the bus 200 for storing information and instructions for the central processor(s) 201, a computer readable non-volatile memory unit 203 (e.g., read only memory, programmable ROM, flash memory, EPROM, EEPROM, etc.) coupled with the bus 200 for storing static information and instructions for the processor(s).

Please amend the paragraph starting on page 9, line 11, as follows:

The vector register file is comprised of 32 64-bit general purpose registers 306 through 310. The general purpose registers 306 through 310 are visible to the programmer and can be used to store intermediate results. The preferred embodiment of the present invention uses the floating point registers [(FGR)] (FPR) of a floating point unit (FPU) as its vector registers.

Please amend the paragraph starting on page 13, line 26, as follows:

Integer vector operations that write to the FPRs clamp the values being written to the target's representable range. That is, the elements are saturated for overflows and underflows. For overflows, the values are clamped to the largest representable value. For underflows, the values are clamped to the smallest representable value.
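The clamping behavior described in this paragraph can be illustrated with a minimal Python sketch. The function name `clamp_to_width`, its parameters, and the signed/unsigned switch are assumptions made for illustration only; they do not appear in the specification.

```python
def clamp_to_width(value: int, bits: int, signed: bool = True) -> int:
    """Saturate `value` to the range representable in `bits` bits.

    Hypothetical helper: overflows are clamped to the largest
    representable value, underflows to the smallest.
    """
    if signed:
        lowest = -(1 << (bits - 1))
        highest = (1 << (bits - 1)) - 1
    else:
        lowest = 0
        highest = (1 << bits) - 1
    return max(lowest, min(highest, value))


# A 16-bit signed target saturates at 32767 and -32768.
print(clamp_to_width(40000, 16))
print(clamp_to_width(-40000, 16))
```

Values already inside the range pass through unchanged, so the helper only alters results that would otherwise overflow or underflow the target element.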

Please amend the paragraph starting on page 17, line 19, as follows:

Load Vector Add (ADDL fmt). According to the ADDL fmt instruction, the corresponding elements in vectors vt and vs are added and then stored into corresponding elements in the accumulator. Any overflows or underflows in the elements wrap around the accumulator's representable range and then are written into the accumulator [706] 806.
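The wrap-around behavior described for ADDL can be sketched in Python as follows. This is an illustration under stated assumptions, not the instruction's actual implementation: the names `wrap_to_width` and `vector_add_wrapped` are hypothetical, the accumulator elements are assumed to be signed two's-complement values, and the default accumulator element width of 24 bits is chosen only as an example.

```python
def wrap_to_width(value: int, bits: int) -> int:
    """Wrap `value` around the signed range of a `bits`-bit element."""
    value &= (1 << bits) - 1          # keep only the low `bits` bits
    if value >= 1 << (bits - 1):      # reinterpret the top bit as the sign
        value -= 1 << bits
    return value


def vector_add_wrapped(vs, vt, acc_bits=24):
    """Add corresponding elements and store each sum in an accumulator
    element, wrapping (not clamping) on overflow or underflow.

    `acc_bits` is an assumed accumulator element width (M).
    """
    return [wrap_to_width(a + b, acc_bits) for a, b in zip(vs, vt)]


# With a deliberately narrow 8-bit accumulator, 100 + 100 wraps to -56.
print(vector_add_wrapped([100, 1], [100, 2], acc_bits=8))
```

The narrow 8-bit width in the usage line is chosen only to make the wrap visible with small numbers.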

Please amend the paragraph starting on page 22, line 1, as follows:

A RACL/RACM/RACH instruction followed by WACL/WACH are used to save and restore the accumulator. This save/restore function is format independent, either format can be used to save or restore accumulator values generated by either QH or OB operations. Data conversion need not occur. The mapping between element bits of the OB format accumulator and bits of the same accumulator interpreted in QH format is implementation specific, but consistent for each implementation.
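One way to picture saving a wide accumulator element through narrower registers and restoring it afterward is the Python sketch below. It is purely illustrative: the paragraph above does not define the bit mapping, so the low/middle/high slicing, the 16-bit slice width, and the function names `split_element` and `join_element` are all assumptions rather than the behavior of the RACL/RACM/RACH or WACL/WACH instructions.

```python
def split_element(value: int, n_bits: int = 16, parts: int = 3):
    """Split one accumulator element into `parts` unsigned slices of
    `n_bits` bits each, ordered low to high (assumed layout)."""
    total = n_bits * parts
    raw = value & ((1 << total) - 1)   # two's-complement bit pattern
    mask = (1 << n_bits) - 1
    return [(raw >> (i * n_bits)) & mask for i in range(parts)]


def join_element(slices, n_bits: int = 16) -> int:
    """Rebuild the signed accumulator element from its slices."""
    total = n_bits * len(slices)
    raw = 0
    for i, part in enumerate(slices):
        raw |= (part & ((1 << n_bits) - 1)) << (i * n_bits)
    if raw >= 1 << (total - 1):        # restore the sign
        raw -= 1 << total
    return raw


saved = split_element(-123456789)      # three 16-bit slices
print(join_element(saved))             # round-trips to the original value
```

Because the slices carry the raw bit pattern, the round trip needs no data conversion, which is consistent with the format-independent save and restore described above.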

***Amendments to the Drawings***

Submitted herewith is a replacement drawing sheet for Figure 2, corresponding to the annotated drawing sheet also submitted herewith. Specifically, element reference numeral 200 was added to Figure 2 to be consistent with the specification as originally filed.

***Amendments to the Claims***

Applicants submit no amendments to the claims.

***Remarks***

Claims 47, 53, 67, 72, and 79-82 are pending in the application, with claims 47, 67, and 79-82 being the independent claims.

This Amendment corrects formal matters without changing the scope of the claims. Specifically, drawing reference numerals were added or corrected in the specification that correspond to the drawings as originally filed, and amendments were made to correct minor informalities throughout the specification. In addition, Figure 2 was amended to include reference numeral 200 to be consistent with the specification as originally filed. Both an annotated drawing sheet and a replacement drawing sheet for Figure 2 are submitted herewith. None of the amendments add new matter. Accordingly, Applicants respectfully request that this Amendment be entered. Prompt and favorable consideration of this Amendment is respectfully requested.

Respectfully submitted,

STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.



Donald J. Featherstone  
Attorney for Applicants  
Registration No. 33,876

Date: 10/13/04

1100 New York Avenue, N.W.  
Washington, D.C. 20005-3934  
(202) 371-2600

321264\_1.DOC



Replacement Drawing Sheet 1 of 1

Appl. No. 09/223,046; Filed: December 30, 1998  
Dkt No. 1778.0110001; Group Unit: 2154  
Inventors: Van Hook *et al.*; Tel. No.: 202-371-2600  
For: Method for Providing Extended Precision in  
SIMD Vector Arithmetic Operations



FIG. 2

FIG. 2

## **EXHIBIT I**



*Pending Claims*  
**U.S. Patent Application No. 09/223,046**

47. A computer-based method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:

- (a) loading a first vector into a first register, said first vector comprising a plurality of N-bit elements;
- (b) loading a second vector into a second register, said second vector comprising a plurality of N-bit elements;
- (c) executing an arithmetic instruction for at least one pair consisting of an N-bit element in said first register and an N-bit element in said second register, to produce a resulting element;
- (d) writing said resulting element into an M-bit element of an accumulator, wherein M is greater than N;
- (e) transforming said resulting element in said accumulator into a width of N-bits; and
- (f) writing said resulting element into a third register;

wherein said accumulator comprises a plurality of M-bit elements and wherein steps (c)-(f) operate on a plurality of elements of said first and second vectors to produce a resultant vector formed from a plurality of resulting elements written to said third register; and

wherein said resulting elements in said accumulator are wrapped around the representable range of said resulting elements.

53. The method as recited in claim 80, wherein said rounding step comprises one of:  
rounding said resulting element towards zero;

rounding said resulting element towards the nearest unit, wherein said resulting element is rounded away from zero if said resulting element is at least halfway towards the nearest unit; and

rounding said resulting element towards the nearest unit, wherein said resulting element is rounded towards zero if said resulting element is at least halfway towards the nearest unit.

67. A processor for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising:

means for executing an arithmetic instruction involving an element of a first vector and an element of a second vector to produce a resulting element, said first and second vector comprising a plurality of N-bit elements;

an accumulator for receiving said resulting element, wherein said resulting element is stored in an M-bit element of said accumulator and wherein M is greater than N;

means for transforming said resulting element in said accumulator into a width of N-bits; and

means for writing said transformed resulting element to a register;  
wherein said accumulator comprises a plurality of M-bit elements and wherein said means for executing is repeated for said plurality of elements of said first and second vectors to produce a plurality of resulting elements that are received by said accumulator and wherein said means for transforming and said means for writing are performed on said plurality of resulting elements; and

wherein said resulting elements in said accumulator are wrapped around the representable range of said resulting elements.

72. The processor as recited in claim 82, wherein said rounding means comprises one of:  
means for rounding said resulting element towards zero;  
means for rounding said resulting element towards the nearest unit, wherein said resulting element is rounded away from zero if said resulting element is at least halfway towards the nearest unit; and  
means for rounding said resulting element towards the nearest unit, wherein said resulting element is rounded towards zero if said resulting element is at least halfway towards the nearest unit.

79. A computer-based method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:

- (a) loading a first vector into a first register, said first vector comprising a plurality of N-bit elements;
- (b) loading a second vector into a second register, said second vector comprising a plurality of N-bit elements;
- (c) executing an arithmetic instruction for at least one pair consisting of an N-bit element in said first register and an N-bit element in said second register, to produce a resulting element;
- (d) writing said resulting element into an M-bit element of an accumulator, wherein M is greater than N;
- (e) transforming said resulting element in said accumulator into a width of N-bits;
- (f) dividing said resulting elements stored in said accumulator into a plurality of subsets;
- (g) writing each subset to at least one of a plurality of registers, each of said plurality of registers having a width smaller than said accumulator width; and
- (h) writing said resulting element into a third register;

wherein said accumulator comprises a plurality of M-bit elements and wherein steps (c)-(h) operate on a plurality of elements of said first and second vectors to produce a resultant vector formed from a plurality of resulting elements written to said third register.

80. A computer-based method for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising the steps of:

- (a) loading a first vector into a first register, said first vector comprising a plurality of N-bit elements;
- (b) loading a second vector into a second register, said second vector comprising a plurality of N-bit elements;
- (c) executing an arithmetic instruction for at least one pair consisting of an N-bit element in said first register and an N-bit element in said second register, to produce a resulting element;
- (d) writing said resulting element into an M-bit element of an accumulator, wherein M is greater than N;
- (e) transforming said resulting element in said accumulator into a width of N-bits, wherein said transforming comprises shifting said resulting element in said accumulator for scaling the value of said resulting element, rounding said resulting element and clamping said resulting element; and
- (f) writing said resulting element into a third register.

81. A processor for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising:

means for executing an arithmetic instruction involving a first plurality of elements of a first vector and a second plurality of elements of a second vector to produce a plurality of resulting elements, said first and second vector comprising a plurality of N-bit elements;

an accumulator for receiving said plurality of resulting elements, wherein said plurality of resulting elements are each stored in one of a plurality of M-bit elements of said accumulator and wherein M is greater than N;

means for transforming said plurality of resulting elements in said accumulator into a width of N-bits;

means for dividing said plurality of resulting elements stored in said accumulator into a plurality of subsets; and

means for writing each subset to at least one of a plurality of registers, each of said plurality of registers having a width smaller than said accumulator width.

82. A processor for providing extended precision in single instruction multiple data (SIMD) arithmetic operations, comprising:

means for executing an arithmetic instruction involving an element of a first vector and an element of a second vector to produce a resulting element, said first and second vector comprising a plurality of N-bit elements;

an accumulator for receiving said resulting element, wherein said resulting element is stored in an M-bit element of said accumulator and wherein M is greater than N;

means for transforming said resulting element in said accumulator into a width of N-bits, wherein said means for transforming comprises means for shifting said resulting element in said accumulator for scaling the value of said resulting element, means for rounding said resulting element, and means for clamping said resulting element; and

means for writing said transformed resulting element to a register.
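The transforming step recited in claims 80 and 82 (shifting for scaling, rounding, then clamping), together with the three rounding alternatives of claims 53 and 72, might be sketched in Python as follows. This is a hedged illustration, not the claimed implementation: the function names, the mode strings, the treatment of the shift as a division by a power of two, and the reading of the third alternative as "round to nearest, with exact halves going towards zero" are all assumptions.

```python
def shift_and_round(value: int, shift: int, mode: str) -> int:
    """Scale `value` by 2**-shift and round per `mode` (hypothetical names)."""
    if shift == 0:
        return value
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    quotient = magnitude >> shift
    remainder = magnitude & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if mode == "toward_zero":
        pass                                   # simply drop the fraction
    elif mode == "nearest_half_away":
        if remainder >= half:                  # halves go away from zero
            quotient += 1
    elif mode == "nearest_half_toward_zero":
        if remainder > half:                   # halves go towards zero
            quotient += 1
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return sign * quotient


def transform_element(acc_value: int, shift: int, mode: str, n_bits: int = 16) -> int:
    """Shift, round, then clamp an M-bit accumulator value to N signed bits."""
    scaled = shift_and_round(acc_value, shift, mode)
    lowest = -(1 << (n_bits - 1))
    highest = (1 << (n_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


# 5 / 2 = 2.5 under each of the three rounding alternatives.
for m in ("toward_zero", "nearest_half_away", "nearest_half_toward_zero"):
    print(m, shift_and_round(5, 1, m))
```

Applying `transform_element` to every element of the accumulator would yield the N-bit values that are then written to the destination register.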

## **EXHIBIT J**

## Supplemental Declaration for Patent Application

Docket Number: 1778.0110001

As a below named inventor, I hereby declare that:

My residence, mailing address and citizenship are as stated below next to my name.

I believe I am an original, first and joint inventor of the subject matter that is claimed and for which a patent is sought on the invention entitled **Method for Providing Extended Precision in SIMD Vector Arithmetic Operations**, the specification of which is attached hereto unless the following box is checked:

- was filed on December 30, 1998;  
as United States Application Number 09/223,046; and  
was amended on August 11, 2000; December 5, 2000; June 14, 2002; and October 13, 2004.

I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any amendment referred to above.

I acknowledge the duty to disclose information that is material to patentability as defined in 37 C.F.R. § 1.56, including for continuation-in-part applications, material information which became available between the filing date of the prior application and the national or PCT international filing date of the continuation-in-part application.

I hereby claim foreign priority benefits under 35 U.S.C. § 119(a)-(d) or (f) or § 365(b) of any foreign application(s) for patent, inventor's or plant breeder's rights certificate(s), or § 365(a) of any PCT international application, which designated at least one country other than the United States of America, listed below, and have also identified below, by checking the box, any foreign application for patent, inventor's or plant breeder's rights certificate(s), or PCT international application having a filing date before that of the application on which priority is claimed.

| Prior Foreign Application(s): |           |                        | Priority Claimed |
|-------------------------------|-----------|------------------------|------------------|
| (Application No.)             | (Country) | (Day/Month/Year Filed) | ☐ Yes ☐ No       |
| (Application No.)             | (Country) | (Day/Month/Year Filed) | ☐ Yes ☐ No       |

Send Correspondence to:  
Customer No. 26111  
STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.  
1100 New York Avenue, N.W.  
Washington, D.C. 20005-3934

Direct Telephone Calls to: (202) 371-2600

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under 18 U.S.C. § 1001 and that such willful false statements may jeopardize the validity of the application or any patent issued thereon.

|                              |                                           |
|------------------------------|-------------------------------------------|
| Full Name of First Inventor: | Timothy J. Van Hook                       |
| Signature of First Inventor: | Date:                                     |
| Residence:                   | Atherton, California                      |
| Citizenship:                 | U.S.A.                                    |
| Mailing Address:             | 224 Oakgrove Avenue<br>Atherton, CA 94027 |


## **EXHIBIT K**

## Don Featherstone - FedEx Shipment 792905387783 Delivered

**From:** Notifications@fedex.com  
**To:** <donf@skgf.com>  
**Date:** 4/28/2005 3:30 PM  
**Subject:** FedEx Shipment 792905387783 Delivered

Our records indicate that the following shipment has been delivered:

Tracking number: 792905387783  
Door Tag number: DT100682224252  
Reference: 1778.0110001  
Ship (P/U) date: Apr 25, 2005  
Delivery date: Apr 28, 2005 12:27 PM  
Signed for by: A.JALPA  
Service type: FedEx Standard Overnight  
Packaging type: FedEx Box  
Number of pieces: 1  
Weight: 1.0 LB

| Shipper Information            | Recipient Information |
|--------------------------------|-----------------------|
| Donald J. Featherstone         | Timothy J. Van Hook   |
| Sterne Kessler Goldstein & Fox | 224 Oakgrove Avenue   |
| 1100 New York Avenue, NW       | Atherton              |
| Washington                     | CA                    |
| DC                             | US                    |
| US                             | 94027                 |
| 20005                          |                       |

Special handling/Services  
Deliver Weekday

Please do not respond to this message. This email was sent from an unattended mailbox. This report was generated at approximately 2:29 PM CDT on 04/28/2005.

For questions about FedEx Express, please call us at 1.800.Go.FedEx.

All weights are estimated.

To track the status of this shipment online, please use the following:  
[http://www.fedex.com/Tracking?tracknumbers=792905387783&action=track&language=english&cntry_code=us&clienttype=ivpodalrt](http://www.fedex.com/Tracking?tracknumbers=792905387783&action=track&language=english&cntry_code=us&clienttype=ivpodalrt)

Thank you for your business.



Attorney Docket No.: SGI 15-4-458.00

## Declaration and Power of Attorney for a Patent Application

### Declaration

As a below named inventor, I hereby declare that my residence, post office address, and citizenship are as stated below my name. Further, I hereby declare that I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled:

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS  
the specification of which:

..... is attached hereto, or  
 was filed on 10/9/97 as application serial no. 08/947,648; and  
..... was amended on .....

I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any amendment referred to above; and

I acknowledge the duty to disclose information which is material to the examination of this application in accordance with Title 37, Code of Federal Regulations, Section 1.56(a).

### Foreign Priority Claim

I hereby claim foreign priority benefits under Title 35, United States Code Section 119 of any foreign application(s) for patent or inventor's certificate listed below and have also identified below any foreign application for patent or inventor's certificate having a filing date before that of the application on which priority is claimed:

| Number | Country | Date Filed | Priority Claimed |
|--------|---------|------------|------------------|
| .....  | .....   | .....      | yes ..... no     |
| .....  | .....   | .....      | yes ..... no     |

### U.S. Priority Claim

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior United States application in the manner provided by the first paragraph of Title 35, United States Code, Section 112, I acknowledge the duty to disclose material information as defined in Title 37, Code of Federal Regulations, Section 1.56(a) which occurred between the filing date of the prior application and the national or PCT international filing date of this application:

| Serial Number | Filing Date | Status (patented/pending/abandoned) |
|---------------|-------------|-------------------------------------|
| .....         | .....       | .....                               |
| .....         | .....       | .....                               |

## Power of Attorney

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and transact all business in the Patent and Trademark Office connected therewith.

|                     |                            |
|---------------------|----------------------------|
| James P. Hao        | Registration No.: 36,398   |
| Anthony C. Murabito | Registration No.: 35,295   |
| John P. Wagner      | Registration No.: 35,398   |
| Glenn D. Barnes     | Registration No.: P-42,293 |
| Wilfred H. Lam      | Registration No.: P-41,923 |
| Steve Weiner        | Registration No.: 38,330   |
| Chris Byrne         | Registration No.: 32,204   |
| Irene Fernandez     | Registration No.: 34,625   |
| John Brigden        | Registration No.: 40,530   |

Send Correspondence to:

**WAGNER, MURABITO & HAO**  
 Two North Market Street, Third Floor  
 San Jose, California 95113  
 (408) 938-9060

## Signatures

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the application or any patent issued thereon.

Full Name of Sole/First Inventor: Timothy van Hook

Inventor's Signature  Date 3/4/98  
 Residence Atherton, California Citizenship USA  
 (City Atherton State CA)  
 P.O. Address 224 Oakgrove Avenue, Atherton, California 94027

Full Name of Second/Joint Inventor: Peter Hsu

Inventor's Signature  Date 3/4/98  
 Residence Fremont, California Citizenship U.S.A.  
 (City Fremont State CA)  
 P.O. Address 2853 Welk Common, Fremont, California 94555

Full Name of Third/Joint Inventor: William A. Huffman

Inventor's Signature  Date 3/4/98  
 Residence Los Gatos, California Citizenship USA  
 (City Los Gatos State CA)  
 P.O. Address 16205 Roseleaf Lane, Los Gatos, California 95032

Full Name of Fourth/Joint Inventor: Henry P. Moreton

Inventor's Signature ..... Date .....  
Residence Woodside, California Citizenship USA  
(City State)  
P.O. Address 140 Phillip Road, Woodside, California 94062-2625

Full Name of Fifth/Joint Inventor: Earl A. Killian

Inventor's Signature ..... Date .....  
Residence Los Altos Hills, California Citizenship USA  
(City State)  
P.O. Address 27961 Central Drive, Los Altos Hills, California 94022



Attorney Docket No.: SGI 15-4-458.00

## Declaration and Power of Attorney for a Patent Application

### Declaration

As a below named inventor, I hereby declare that my residence, post office address, and citizenship are as stated below my name. Further, I hereby declare that I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled:

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS  
the specification of which:

..... is attached hereto, or  
 was filed on 10/9/97 as application serial no. 08/947,648 ; and  
..... was amended on .....

I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any amendment referred to above; and

I acknowledge the duty to disclose information which is material to the examination of this application in accordance with Title 37, Code of Federal Regulations, Section 1.56(a).

### Foreign Priority Claim

I hereby claim foreign priority benefits under Title 35, United States Code Section 119 of any foreign application(s) for patent or inventor's certificate listed below and have also identified below any foreign application for patent or inventor's certificate having a filing date before that of the application on which priority is claimed:

| Number | Country | Date Filed | Priority Claimed |
|--------|---------|------------|------------------|
| .....  | .....   | .....      | yes ..... no     |
| .....  | .....   | .....      | yes ..... no     |

### U.S. Priority Claim

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior United States application in the manner provided by the first paragraph of Title 35, United States Code, Section 112, I acknowledge the duty to disclose material information as defined in Title 37, Code of Federal Regulations, Section 1.56(a) which occurred between the filing date of the prior application and the national or PCT international filing date of this application:

| Serial Number | Filing Date | Status (patented/pending/abandoned) |
|---------------|-------------|-------------------------------------|
| .....         | .....       | .....                               |
| .....         | .....       | .....                               |

## Power of Attorney

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and transact all business in the Patent and Trademark Office connected therewith.

|                     |                            |
|---------------------|----------------------------|
| James P. Hao        | Registration No.: 36,398   |
| Anthony C. Murabito | Registration No.: 35,295   |
| John P. Wagner      | Registration No.: 35,398   |
| Glenn D. Barnes     | Registration No.: P-42,293 |
| Wilfred H. Lam      | Registration No.: P-41,923 |
| Steve Weiner        | Registration No.: 38,330   |
| Chris Byrne         | Registration No.: 32,204   |
| Irene Fernandez     | Registration No.: 34,625   |
| John Brigden        | Registration No.: 40,530   |

Send Correspondence to:

**WAGNER, MURABITO & HAO**  
 Two North Market Street, Third Floor  
 San Jose, California 95113  
 (408) 938-9060

## Signatures

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the application or any patent issued thereon.

Full Name of Sole/First Inventor: Timothy van Hook

Inventor's Signature ..... Date .....  
 Residence Atherton, California Citizenship USA  
 (City Atherton State CA)  
 P.O. Address 224 Oakgrove Avenue, Atherton, California 94027

Full Name of Second/Joint Inventor: Peter Hsu

Inventor's Signature ..... Date .....  
 Residence Fremont, California Citizenship .....  
 (City Fremont State CA)  
 P.O. Address 2853 Welk Common, Fremont, California 94555

Full Name of Third/Joint Inventor: William A. Huffman

Inventor's Signature William C. Huffman Date Mar 5, 1998  
 Residence Los Gatos, California Citizenship USA  
 (City Los Gatos State CA)  
 P.O. Address 16205 Roseleaf Lane, Los Gatos, California 95032

Full Name of Fourth/Joint Inventor: Henry P. Moreton

Inventor's Signature \_\_\_\_\_ Date \_\_\_\_\_  
Residence Woodside, California Citizenship USA  
(City                    State                   )  
P.O. Address 140 Phillip Road, Woodside, California 94062-2625

Full Name of Fifth/Joint Inventor: Earl A. Killian

Inventor's Signature \_\_\_\_\_ Date \_\_\_\_\_  
Residence Los Altos Hills, California Citizenship USA  
(City                    State                   )  
P.O. Address 27961 Central Drive, Los Altos Hills, California 94022



Attorney Docket No.: SGI 15-4-458.00

## Declaration and Power of Attorney for a Patent Application

### Declaration

As a below named inventor, I hereby declare that my residence, post office address, and citizenship are as stated below my name. Further, I hereby declare that I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled:

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS  
the specification of which:

..... is attached hereto, or  
 was filed on 10/9/97 as application serial no. 08/947,648; and  
..... was amended on .....

I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any amendment referred to above; and

I acknowledge the duty to disclose information which is material to the examination of this application in accordance with Title 37, Code of Federal Regulations, Section 1.56(a).

### Foreign Priority Claim

I hereby claim foreign priority benefits under Title 35, United States Code Section 119 of any foreign application(s) for patent or inventor's certificate listed below and have also identified below any foreign application for patent or inventor's certificate having a filing date before that of the application on which priority is claimed:

| Number | Country | Date Filed | Priority Claimed |
|--------|---------|------------|------------------|
| .....  | .....   | .....      | yes ..... no     |
| .....  | .....   | .....      | yes ..... no     |

### U.S. Priority Claim

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior United States application in the manner provided by the first paragraph of Title 35, United States Code, Section 112, I acknowledge the duty to disclose material information as defined in Title 37, Code of Federal Regulations, Section 1.56(a) which occurred between the filing date of the prior application and the national or PCT international filing date of this application:

| Serial Number | Filing Date | Status (patented/pending/abandoned) |
|---------------|-------------|-------------------------------------|
| .....         | .....       | .....                               |
| .....         | .....       | .....                               |

## Power of Attorney

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and transact all business in the Patent and Trademark Office connected therewith.

|                     |                            |
|---------------------|----------------------------|
| James P. Hao        | Registration No.: 36,398   |
| Anthony C. Murabito | Registration No.: 35,295   |
| John P. Wagner      | Registration No.: 35,398   |
| Glenn D. Barnes     | Registration No.: P-42,293 |
| Wilfred H. Lam      | Registration No.: P-41,923 |
| Steve Weiner        | Registration No.: 38,330   |
| Chris Byrne         | Registration No.: 32,204   |
| Irene Fernandez     | Registration No.: 34,625   |
| John Brigden        | Registration No.: 40,530   |

Send Correspondence to:

**WAGNER, MURABITO & HAO**  
 Two North Market Street, Third Floor  
 San Jose, California 95113  
 (408) 938-9060

## Signatures

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the application or any patent issued thereon.

Full Name of Sole/First Inventor: Timothy van Hook

Inventor's Signature ..... Date .....  
 Residence Atherton, California Citizenship USA  
 (City State) .....  
 P.O. Address 224 Oakgrove Avenue, Atherton, California 94027

Full Name of Second/Joint Inventor: Peter Hsu

Inventor's Signature ..... Date .....  
 Residence Fremont, California Citizenship .....  
 (City State) .....  
 P.O. Address 2853 Welk Common, Fremont, California 94555

Full Name of Third/Joint Inventor: William A. Huffman

Inventor's Signature ..... Date .....  
 Residence Los Gatos, California Citizenship USA  
 (City State) .....  
 P.O. Address 16205 Roseleaf Lane, Los Gatos, California 95032

Full Name of Fourth/Joint Inventor: Henry P. Moreton

Inventor's Signature  Date 3/6/98  
Residence Woodside, California Citizenship USA  
(City State)  
P.O. Address 140 Phillip Road, Woodside, California 94062-2625

Full Name of Fifth/Joint Inventor: Earl A. Killian

Inventor's Signature \_\_\_\_\_ Date \_\_\_\_\_  
Residence Los Altos Hills, California Citizenship USA  
(City State)  
P.O. Address 27961 Central Drive, Los Altos Hills, California 94022



Attorney Docket No.: SGI 15-4-458.00

## Declaration and Power of Attorney for a Patent Application

### Declaration

As a below named inventor, I hereby declare that my residence, post office address, and citizenship are as stated below my name. Further, I hereby declare that I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled:

A METHOD FOR PROVIDING EXTENDED PRECISION IN SIMD VECTOR ARITHMETIC OPERATIONS  
the specification of which:

..... is attached hereto, or  
 was filed on 10/9/97 as application serial no. 08/947,648; and  
..... was amended on .....

I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any amendment referred to above; and

I acknowledge the duty to disclose information which is material to the examination of this application in accordance with Title 37, Code of Federal Regulations, Section 1.56(a).

### Foreign Priority Claim

I hereby claim foreign priority benefits under Title 35, United States Code Section 119 of any foreign application(s) for patent or inventor's certificate listed below and have also identified below any foreign application for patent or inventor's certificate having a filing date before that of the application on which priority is claimed:

| Number | Country | Date Filed | Priority Claimed |
|--------|---------|------------|------------------|
| .....  | .....   | .....      | yes ..... no     |
| .....  | .....   | .....      | yes ..... no     |

### U.S. Priority Claim

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior United States application in the manner provided by the first paragraph of Title 35, United States Code, Section 112, I acknowledge the duty to disclose material information as defined in Title 37, Code of Federal Regulations, Section 1.56(a) which occurred between the filing date of the prior application and the national or PCT international filing date of this application:

| Serial Number | Filing Date | Status (patented/pending/abandoned) |
|---------------|-------------|-------------------------------------|
| .....         | .....       | .....                               |
| .....         | .....       | .....                               |

## Power of Attorney

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and transact all business in the Patent and Trademark Office connected therewith.

|                     |                            |
|---------------------|----------------------------|
| James P. Hao        | Registration No.: 36,398   |
| Anthony C. Murabito | Registration No.: 35,295   |
| John P. Wagner      | Registration No.: 35,398   |
| Glenn D. Barnes     | Registration No.: P-42,293 |
| Wilfred H. Lam      | Registration No.: P-41,923 |
| Steve Weiner        | Registration No.: 38,330   |
| Chris Byrne         | Registration No.: 32,204   |
| Irene Fernandez     | Registration No.: 34,625   |
| John Brigden        | Registration No.: 40,530   |

Send Correspondence to:

**WAGNER, MURABITO & HAO**  
 Two North Market Street, Third Floor  
 San Jose, California 95113  
 (408) 938-9060

## Signatures

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the application or any patent issued thereon.

Full Name of Sole/First Inventor: Timothy van Hook

Inventor's Signature ..... Date .....  
 Residence Atherton, California Citizenship USA  
 (City Atherton State CA)  
 P.O. Address 224 Oakgrove Avenue, Atherton, California 94027

Full Name of Second/Joint Inventor: Peter Hsu

Inventor's Signature ..... Date .....  
 Residence Fremont, California Citizenship .....  
 (City Fremont State CA)  
 P.O. Address 2853 Welk Common, Fremont, California 94555

Full Name of Third/Joint Inventor: William A. Huffman

Inventor's Signature ..... Date .....  
 Residence Los Gatos, California Citizenship USA  
 (City Los Gatos State CA)  
 P.O. Address 16205 Roseleaf Lane, Los Gatos, California 95032

Full Name of Fourth/Joint Inventor: Henry P. Moreton

Inventor's Signature ..... Date .....  
Residence Woodside, California Citizenship USA  
(City Woodside State CA)  
P.O. Address 140 Phillip Road, Woodside, California 94062-2625

Full Name of Fifth/Joint Inventor: Earl A. Killian

Inventor's Signature Earl A. Killian Date 20 March 1998  
Residence Los Altos Hills, California Citizenship USA  
(City Los Altos Hills State CA)  
P.O. Address 27961 Central Drive, Los Altos Hills, California 94022